Abstract


We propose a new model for the coherent forecasting of both the implied volatility 
surfaces and the underlying asset returns. In the spirit of Guyon and Lekeufack (2023), who
are interested in the dependence of volatility indices (e.g. the VIX) on the paths of the 
associated equity indices (e.g. the S&P 500), we first study how implied volatility can be 
predicted using the past trajectory of the underlying asset price. Our empirical study reveals 
that a large part of the movements of the at-the-money-forward implied volatility for up to 
two years maturities can be explained using the past returns and their squares. Moreover, 
we show that up to four years of the past evolution of the underlying price should be used 
for the prediction and that this feedback effect gets weaker when the maturity increases. 
Building on this new stylized fact, we fit to historical data a parsimonious version of the 
SSVI parameterization of the implied volatility surface relying 
on only four parameters and show that the two parameters ruling the at-the-money-forward 
implied volatility as a function of the maturity exhibit a path-dependent behavior with 
respect to the underlying asset price. Finally, we propose a model for the joint dynamics 
of the implied volatility surface and the underlying asset price. The latter is modelled 
using a variant of the path-dependent volatility model of Guyon and Lekeufack and the 
former is obtained by adding a feedback effect of the underlying asset price onto the two 
parameters ruling the at-the-money-forward implied volatility in the parsimonious SSVI 
parameterization and by specifying a hidden semi-Markov diffusion model for the residuals 
of these two parameters and the two other parameters. Thanks to this model, we are able 
to simulate highly realistic paths of implied volatility surfaces that are arbitrage-free. 


Keywords: implied volatility modelling, SSVI, path-dependent volatility 


1. Introduction 


One of the many reasons for the success of the Black-Scholes model (Black and Scholes, 1973) is the
existence of a one-to-one correspondence between the price $C(K,T)$ of a European call option with
strike $K$ and maturity $T$ and the volatility $\sigma$ of the geometric Brownian motion modelling the dynamics
of the underlying asset price $(S_t)_{t \ge 0}$, provided that $(S_0 - Ke^{-rT})^+ < C(K,T) < S_0$ ($r$ is the constant
risk-free rate), which is guaranteed by the absence of arbitrage opportunities. When this condition is
satisfied, the unique parameter $\sigma$ satisfying $C_{BS}(K,T,\sigma) = C(K,T)$, where $C_{BS}$ denotes the Black-Scholes
call option price, is called the implied volatility of the call option. By put-call parity, the implied
volatility of the put option is equal to that of the call option with the same maturity and strike. Although
the implied volatility does not add any new information with respect to the option price, it is commonly
used to quote option prices on the markets, mainly because it makes it easy to compare the values of two
options with different underlying assets, whereas the option price heavily depends on the underlying asset
price level, making the comparison more difficult. If the Black-Scholes model were an accurate description
of financial markets, the implied volatility would be the same for all options on a given asset regardless
of the maturity and the strike. The computation of the implied volatility from market option prices
shows that the implied volatility actually depends on the maturity and the strike, which invalidates the
Black-Scholes model. The so-called implied volatility surface (IVS) $(K,T) \mapsto \sigma_{BS}(K,T)$ fully
describes the option prices on a given asset.
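As an illustration, the inversion of the Black-Scholes formula that defines the implied volatility can be sketched numerically. This is only a minimal sketch assuming a non-dividend-paying asset; the function names `bs_call_price` and `implied_volatility`, the bracketing interval and the numerical values are illustrative assumptions and are not taken from the study.

```python
import math

from scipy.optimize import brentq
from scipy.stats import norm


def bs_call_price(s0, k, t, r, sigma):
    """Black-Scholes price of a European call (no dividends assumed)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm.cdf(d1) - k * math.exp(-r * t) * norm.cdf(d2)


def implied_volatility(price, s0, k, t, r, lo=1e-6, hi=5.0):
    """Solve C_BS(K, T, sigma) = price for sigma on [lo, hi] (hypothetical bracket)."""
    lower_bound = max(s0 - k * math.exp(-r * t), 0.0)
    # The inversion is only meaningful inside the no-arbitrage bounds.
    if not lower_bound < price < s0:
        raise ValueError("price violates the no-arbitrage bounds")
    return brentq(lambda sigma: bs_call_price(s0, k, t, r, sigma) - price, lo, hi)


# Round trip on illustrative values: price a call, then recover its volatility.
sigma_true = 0.2
price = bs_call_price(100.0, 105.0, 0.5, 0.01, sigma_true)
sigma_implied = implied_volatility(price, 100.0, 105.0, 0.5, 0.01)
```

Since the Black-Scholes call price is increasing in the volatility, a bracketing root-finder is sufficient here; `brentq` raises an error if the root lies outside the chosen bracket.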


It is also well known that the level and the shape of the IVS vary with time. Being able to jointly
model the time evolution of the IVS and the underlying asset price is key for applications covering asset
allocation, risk management and hedging. First, such a model allows one to backtest or study the P&L
distribution of an investment strategy involving options and the underlying asset. One can think for
example of the strategy consisting in buying a stock and a put of strike $K_1$ and selling a put of strike
$K_2$ with $K_2 < K_1$ but with the same maturity (this is called a put spread). This strategy protects the
investor against a drop in the underlying asset price down to the $K_2$ threshold in exchange for a lower
premium in comparison to just buying a put of strike $K_1$. By extension, the modelling of the IVS and
the underlying asset price makes it possible to optimize an asset allocation strategy involving options.
Another application relates to the design and the backtesting of hedging strategies for financial products
(e.g. volatility swaps, options on the VIX, etc.) having a volatility risk which is measured by the
Black-Scholes vega. To complete this non-exhaustive list, let us finally mention that an IVS-underlying model
can also be useful in the insurance industry for:


1. computing the equity volatility distribution over a one-year horizon to estimate the capital re- 
quirement within Solvency II internal models and 


2. assessing the time value of options and guarantees within insurance contracts and analyzing the 
underlying hedging strategies of long-term life insurance contracts embedding path-dependent 
options. 


1.1. Literature review 


Inspired by the market models of (1992) and Brace et al. (1997) for the interest rates term
structure, oe ere (a and (1999) independently proposed a modelling 
framework for the joint dynamics of the IVS and the underlying asset price where both are solutions 
of stochastic differential equations (SDEs) where the drift and volatility coefficients are only functions 
of the time, the maturity and the strike or moneyness. In particular, no-arbitrage conditions on the 
drift are derived to guarantee the absence of arbitrage opportunities under the risk-neutral probability. 
A similar approach is adopted by (2001). More empirical studies include the papers from 
and (2003). The former applies a principal component analysis 
(PCA) to historical implied volatilities grouped in three maturity buckets and identifies two factors ex- 
plaining 78% of the smiles variation while the latter applies a common PCA and identifies three factors 
explaining more than 98% of the variations. To deal with the fact that the study of the dynamics of 
the IVS is a three-dimensional problem (time, maturity and strike), Cont and da Fonseca use a
Karhunen-Loève decomposition instead of a PCA. They show that the dynamics of IVSs can be well
summarized by three orthogonal factors which can be interpreted as the level, the orientation (i.e. a 
positive shock of this factor increases the volatilities of out-of-the-money calls while decreasing those of 
out-of-the-money puts) and the convexity of the surface. The associated principal components exhibit 
persistence (i.e. autocorrelation) and mean reversion close to the one of an AR(1) process. Therefore, 
Cont and da Fonseca suggest to model each of the principal component as an Ornstein-Uhlenbeck pro- 
cess. extend this model by specifying the dynamics of the underlying asset price which 
shares noise terms with the dynamics of the IVS allowing in particular to account for the correlation 
between the underlying price and the volatility surface level. A second extension is provided by [Cont and] 
which allows to limit the number of scenarios with static arbitrages by resampling from a 
given set of IVSs scenarios using smaller weights for scenarios with arbitrages. Another way to address 
this modelling problem in the literature is to resort to parametric or semi-parametric factor models, see
e.g. Hafner and Schmid (2005), Fengler et al. (2007) or (2023). More recently, machine
learning techniques such as GANs or neural SDEs have also been used to generate realistic simulations of
implied volatility surfaces, see e.g. (2019), (2021) and Choudhary et al. (2023).
Finally, let us mention the paper of Morel et al. (2023) who introduced the so-called Path Shadowing
Monte Carlo method which, combined with a statistical model of prices, makes it possible to obtain
state-of-the-art predictions of option smiles using only the distribution of the price process.


In this paper, we develop a new joint model of the IVS and the underlying asset price. Instead of 
specifying the IVS as the solution of a given SDE or as a linear combination of several factors (whether 
parametric, semi-parametric or non-parametric), we propose to consider a parameterization of the IVS 
whose parameters' evolution depends on the path of the underlying asset price. The chosen parameterization
is the celebrated SSVI parameterization, which is known to reproduce observed IVSs well
and guarantees the absence of static arbitrage under mild conditions. This
modelling paradigm consisting in making dynamic the parameters of a model fitting market data at 
some point in time is similar to the one of who developed a very general 
mathematical framework for designing consistent dynamic market models. In Carmona et al. (2017), the
authors provide a practical implementation of this framework for IVSs allowing to simulate IVSs that 
are free of both static and dynamic arbitrage. Moreover, they use these simulations of IVSs to find the 
portfolio with smallest variance for a portfolio consisting of n options of same maturity but different 
strikes. In the same vein, used a SVI model whose parameters are stochastic 
processes to model the dynamics of the entire IVS. A convolutional LSTM (Long Short-Term Memory) 
neural network is used to learn the joint dynamics of these parameters and the underlying forward price. 
There is one main difference between our approach and the ones of these papers and the literature in
general. In our approach, we introduce an explicit modelling of the impact of the underlying asset price
on the level and the shape of the IVS in the spirit of Guyon and Lekeufack (2023), who focus on volatility
indices and realized volatility (hence not on the IVS). Indeed, in the above literature, the dependence
structure between the IVS and the underlying asset price is generally captured through simple assumptions
such as a Gaussian copula, common noise terms or using the short-term implied volatility as a term
in the underlying asset stochastic volatility dynamics. Moreover, we model the underlying price using
the path-dependent volatility framework of Guyon and Lekeufack (2023), which exhibits high statistical
consistency and captures multiple historical stylized facts (leverage effect, volatility clustering, weak and
strong Zumbach effects). Before giving more details on our approach, we find it useful to dedicate a section
to Guyon and Lekeufack's main results.


1.2. Guyon and Lekeufack’s path-dependent volatility model 


Guyon and Lekeufack (2023) showed that the level of the volatility of major equity indices is essentially
explained by the past variations of these equity indices, or in other words, they showed that volatility
is mostly path-dependent. To be more specific, they consider two measures of the volatility: the value
of an implied volatility index such as the VIX and an estimator of the realized volatility over one day
using intraday observations of the equity index. We recall that an implied volatility index is a measure
of the expected future variance of a given underlying index (for example the S&P 500 for the VIX)
at a given horizon $T$. Mathematically, the expected future variance writes $\mathbb{E}\left[\int_0^T \sigma_t^2 \, dt\right]$ where $\sigma$ is
the instantaneous volatility of the underlying index and $\mathbb{E}$ here denotes the expectation under the
risk-neutral probability. The expected future variance can be estimated from the prices of traded calls and
puts on the underlying index using the Carr and Madan (2001) formula. We refer for example to the
documentation of the VIX (CBOE, 2023) or the VSTOXX (STOXX, 2023) for more details. Note that
Guyon and Lekeufack only use short-term implied volatility indices (the horizon $T$ is below 30 days)
since they are interested in the modelling of the instantaneous volatility. Let us now introduce the model
that they calibrate for both measures of volatility. Let $(S_t)_{t \ge 0}$ be the price process of an equity index
and $\mathrm{Volatility}_t$ be one of the two above-mentioned measures of volatility. The Path-Dependent Volatility
(PDV) model from the empirical study of Guyon and Lekeufack (2023) writes as follows:


$$\mathrm{Volatility}_t = \beta_0 + \beta_1 R_{1,t} + \beta_2 \Sigma_t. \tag{1.1}$$

The features $R_{1,t}$ and $\Sigma_t$ are defined on a time grid $(t_i)_{i \in \mathbb{N}}$ as follows:

• $R_{1,t}$ is a trend feature given by:

$$R_{1,t} = \sum_{t_i \le t} K_1(t - t_i) \, r_{t_i}, \tag{1.2}$$

where $r_{t_i} = (S_{t_i} - S_{t_{i-1}})/S_{t_{i-1}}$ and $K_1 : \mathbb{R}_+ \to \mathbb{R}_+$ is a decreasing kernel weighting the past
returns. This feature captures the leverage effect, i.e. the fact that volatility tends to rise
when prices fall.

• $\Sigma_t$ is an activity or volatility feature given by:

$$\Sigma_t = \sqrt{\sum_{t_i \le t} K_2(t - t_i) \, r_{t_i}^2}, \tag{1.3}$$

where $K_2$ is also a decreasing kernel. This feature captures the volatility clustering
phenomenon, i.e. the fact that periods of large volatility tend to be followed by periods of large
volatility, and periods of small volatility tend to be followed by periods of small volatility.

Figure 1: Values of the VIX against values of the features $R_1$ and $\Sigma$ on the train set. The blue line
represents $\mathbb{E}[Y \mid X]$ (obtained by a locally weighted scatterplot smoothing) when displaying $Y$ vs $X$.


In order to capture both the short and long memory of volatility, they propose a time-shifted power law
(TSPL) for the two kernels $K_1$ and $K_2$:

$$K_j(\tau) = \frac{Z_{\alpha_j,\delta_j}}{(\tau + \delta_j)^{\alpha_j}}, \quad j = 1,2, \tag{1.4}$$

with $Z_{\alpha_j,\delta_j}$ the normalization constant such that $\sum_{t - C \le t_i \le t} K_j(t - t_i)\Delta = 1$ where $\Delta = 1/252$ (business
days frequency) and $C$ is a hyperparameter (called the cut-off lag later in the paper) controlling at
which point the sums in $R_{1,t}$ and $\Sigma_t$ are truncated.
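The construction of the two features from the TSPL kernels can be sketched as follows. This is only an illustrative sketch and not the authors' implementation: the function names `tspl_kernel` and `pdv_features` are hypothetical, the cut-off lag is expressed in business days, and we assume that the truncated sums contain the $C + 1$ most recent returns, the latest one being assigned a lag of zero.

```python
import numpy as np

DELTA = 1.0 / 252.0  # business-day time step, in years


def tspl_kernel(alpha, delta, cutoff):
    """Values K(tau) for tau = 0, DELTA, ..., cutoff * DELTA.

    The constant Z is set so that sum(K) * DELTA = 1, which is the
    normalization stated after Equation (1.4).
    """
    lags = np.arange(cutoff + 1) * DELTA
    raw = (lags + delta) ** (-alpha)
    return raw / (raw.sum() * DELTA)


def pdv_features(returns, alpha1, delta1, alpha2, delta2, cutoff):
    """Trend feature R_1 and volatility feature Sigma.

    Output index j corresponds to the date of returns[j + cutoff], i.e. the
    first date at which cutoff + 1 returns are available.
    """
    returns = np.asarray(returns, dtype=float)
    if returns.size < cutoff + 1:
        raise ValueError("not enough returns for the chosen cut-off lag")
    k1 = tspl_kernel(alpha1, delta1, cutoff)
    k2 = tspl_kernel(alpha2, delta2, cutoff)
    # In "valid" mode, np.convolve pairs kernel[0] with the most recent return.
    r1 = np.convolve(returns, k1, mode="valid")
    sigma = np.sqrt(np.convolve(returns**2, k2, mode="valid"))
    return r1, sigma


# Sanity check on constant daily returns of 1% (illustrative values only).
const_returns = np.full(300, 0.01)
r1_const, sigma_const = pdv_features(const_returns, 1.5, 0.05, 1.2, 0.1, cutoff=50)
```

With this normalization, constant daily returns $c$ give $R_1 = c/\Delta$ and $\Sigma = |c|/\sqrt{\Delta}$, so both features are expressed in annualized units.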


In order to measure to which extent the two features of the PDV model explain the variations
of the volatility, they use the $R^2$ score whose formula is recalled below:

$$R^2(y, \hat{y}) = 1 - \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{\sum_{i=1}^n (y_i - \bar{y}_n)^2}, \tag{1.5}$$

where $y = (y_i)_{1 \le i \le n}$ are the observed data, $\hat{y} = (\hat{y}_i)_{1 \le i \le n}$ are the predicted data and $\bar{y}_n = \frac{1}{n}\sum_{i=1}^n y_i$.
When they calibrate the PDV model on implied volatility indices data, they obtain $R^2$ scores over tested
indices that are above 87% on the train set (January 1, 2000 to December 31, 2018) and above 80% on
the test set (January 1, 2019 to May 15, 2022), which shows that the PDV model explains a large part
of the variability observed in the volatility dynamics. In Figure 1, we reproduce two graphs from their
paper that indicate quite clearly the linear relationship between the two features and the VIX. When
calibrated on realized volatility data, the performance of the PDV model is reduced: the $R^2$ score is
about 70% on the train set and 60% on the test set.
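For concreteness, the $R^2$ score of Equation (1.5) can be computed with a few lines of code. The function name `r2_score` and the toy inputs below are illustrative choices.

```python
import numpy as np


def r2_score(y_obs, y_pred):
    """R^2 score: one minus the ratio of residual to total sum of squares."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


perfect = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # exact predictions
mean_only = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # predicting the mean
reversed_ = r2_score([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])  # worse than the mean
```

The score equals 1 for exact predictions, 0 when the prediction is the sample mean, and it can be negative out of sample when the model does worse than the mean.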


1.3. Contributions 


The first contribution of the present paper is an empirical study of the dependence of implied volatility
on the past movements of the underlying asset price for options on the S&P 500 and options on the
Euro Stoxx 50. This empirical study is inspired by the one of Guyon and Lekeufack (2023) but there are
several differences. First and foremost, we work on implied volatility instead of implied volatility indices:
the former represents the price of an option and is determined by supply and demand while the latter
represent measures of the expected future variance (see previous section) and are determined as linear
combinations of prices of calls and puts covering the liquid strikes and the two time-to-maturities that
are the closest to 30 days. Both are therefore close only if we consider implied volatilities of 1-month
maturity options. Since we consider maturities up to 24 months, our study can be seen as an extension
of the one of Guyon and Lekeufack (2023). Second, we analyze the influence of the cut-off lag of the
kernel on the performance of the PDV model. Finally, we add a regularization term in the calibration
of the model and study its impact. Our study also differs from the one of Bakshi et al. (2000) because
they focus mostly on the frequency with which call (resp. put) prices move in the same direction (resp.
opposite direction) as the underlying asset price but they do not try to exhibit a functional relationship
between the two and they do not use the past path of the underlying asset price.


The second contribution is to propose a parsimonious version of the Surface Stochastic Volatility
Inspired (SSVI) parameterization of the IVS which relies only on four
parameters and provides a reasonable replication of the market IVSs for a wide range of dates. This
parsimonious SSVI parameterization is free of static arbitrage provided that a simple inequality constraint
involving two parameters is satisfied. Moreover, it is consistent with the well-known power-law
decay of the at-the-money-forward (denoted by ATM in the sequel for the sake of simplicity) skew (see
e.g. Gatheral et al. (2023)). We also show that the two parameters governing the ATM implied volatility
curve as a function of the maturity can be well explained by the past path of the underlying asset price.


Our final contribution is to introduce a new model for the joint dynamics of the underlying asset
price and the implied volatility surface, which makes it possible to perform Monte Carlo simulations under
the real-world probability. This model is obtained by specifying the time evolution of the four parameters
of the parsimonious SSVI parameterization for the IVS and combining it with a variant of the PDV model of
Guyon and Lekeufack (2023) for the underlying price. The dynamics of the two parameters governing the ATM
implied volatility curve contains a functional dependence on the past path of the underlying price, which
embeds in the model the feedback effect that we observe on historical data. Moreover, the residuals
of these two parameters along with the two other parameters of the parsimonious SSVI parameterization
are modelled using a hidden semi-Markov process. Together with the model specification, we also provide
a calibration methodology for all the parameters that are involved in the dynamics. Ultimately, we show
through sample paths and quantile envelopes that the IVSs simulated with our model are highly realistic.


This paper is organized as follows: in Section 2, we start with the empirical study of the dependence
of implied volatility on the past movements of the underlying asset price. Then, we present the SSVI
parameterization and its parsimonious version as well as some calibration results in Section 3. Finally,
Section 4 is dedicated to the introduction of our new path-dependent SSVI model for simulating implied
volatility surfaces and the underlying asset price.


2. Empirical study of the joint dynamics of the implied volatility 
and its underlying index 


2.1. Data sets 


We consider two data sets from Refinitiv of daily implied volatility surfaces corresponding to options
on the S&P 500 index and the Euro Stoxx 50 index respectively. These data sets start on March 8, 2012
and end on December 30, 2022. They contain the at-the-money-forward (denoted by ATM in the sequel
for the sake of simplicity) implied volatilities for maturities ranging from 1 month to 24 months with a
monthly timestep. For the same range of maturities, the data sets also contain the implied volatilities for
Black-Scholes deltas in the following range: ±0.1, ±0.15, ±0.2, ±0.25, ±0.3, ±0.35, ±0.4, ±0.45 (positive
deltas correspond to calls while negative deltas correspond to puts). As a reminder, the Black-Scholes
delta corresponds to the sensitivity of the option price with respect to the price of the underlying asset.
Its formula is recalled below:

$$\Delta^{BS} = \epsilon \, \mathcal{N}(\epsilon d_1) \quad \text{with} \quad d_1 = \frac{\ln \frac{S_0}{K} + \left(r + \frac{\sigma^2}{2}\right) T}{\sigma \sqrt{T}}, \tag{2.1}$$

where $\mathcal{N}$ is the cumulative normal distribution function, $\epsilon = 1$ for call options and $-1$ for put options,
$K$ is the strike, $r$ the constant risk-free rate, $T$ the maturity and $\sigma$ the Black-Scholes implied volatility.
In the sequel of this section, we only focus on the ATM implied volatilities but, in Sections 3 and 4, the
away-from-the-money implied volatilities will also be used. Note that in practice, options with the above
maturities cannot be traded every day on the market. The mapping between the quotes of the options
that are actually traded and the quotes in our database is at the discretion of the data provider. For
example, on the Chicago Board Options Exchange where calls and puts on the S&P 500 are traded, the
following options can be traded at day t:


1. Weekly expiry options: options expiring every business day between t and t + 28 business days.
Note that before 2022, there were only Monday-, Wednesday- and Friday-expiring options.


2. End-of-Month options: options expiring the last business day of the month for up to twelve 
months after t. 


3. Monthly expiry options: options expiring the third Friday of the month for a given range of 
future months up to 5 years after t. 


A similar decomposition can be found for options written on the Euro Stoxx 50 on the Eurex but with 
differences in the expiry dates (for example, there are only Weekly expiry options that expire on Fridays). 
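Since the data sets quote implied volatilities on a grid of Black-Scholes deltas, it can be useful to move between deltas and strikes. The snippet below is a minimal sketch based on Equation (2.1), assuming no dividends; the function names `bs_delta` and `strike_from_delta` and the numerical values are illustrative assumptions, and the volatility passed to `strike_from_delta` is meant to be the implied volatility quoted at that delta.

```python
import math

from scipy.stats import norm


def bs_delta(s0, k, t, r, sigma, eps):
    """Black-Scholes delta; eps = +1 for a call and -1 for a put."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    return eps * norm.cdf(eps * d1)


def strike_from_delta(delta, s0, t, r, sigma):
    """Strike whose Black-Scholes delta equals `delta` (positive: call, negative: put)."""
    eps = 1.0 if delta > 0 else -1.0
    # Invert delta = eps * N(eps * d1) for d1, then solve d1 for the strike.
    d1 = eps * norm.ppf(eps * delta)
    return s0 * math.exp(-d1 * sigma * math.sqrt(t) + (r + 0.5 * sigma**2) * t)


# Illustrative round trip for a 25-delta put and a 25-delta call.
k_put = strike_from_delta(-0.25, 100.0, 1.0, 0.02, 0.2)
k_call = strike_from_delta(0.25, 100.0, 1.0, 0.02, 0.2)
delta_put = bs_delta(100.0, k_put, 1.0, 0.02, 0.2, eps=-1.0)
delta_call = bs_delta(100.0, k_call, 1.0, 0.02, 0.2, eps=1.0)
```

In this example the 25-delta put strike lies below the spot price and the 25-delta call strike lies above it, as expected for out-of-the-money options.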


Along with these two IVSs data sets, we also have daily time series of the S&P 500 and Euro Stoxx 
50 indices. The S&P 500 time series starts on January 2, 1980 while the Euro Stoxx 50 time series starts 
on December 31, 1986 and both end on December 30, 2022. Note that since we will use at most 12 years 
of past returns to predict the implied volatility, the whole time series are not used in the following study. 


To measure the out-of-sample performance of the tested model, we split the two data sets into a 
train set and a test set: the train set spans the period from March 8, 2012 to December 31, 2020 and 
the test set spans the period from January 1, 2021 to December 30, 2022 so that approximately 80% of 
the data is used for the train and 20% is used for the test. In addition, we will also consider a blocked
cross-validation in Section 2.3.4. The 1-month ATM implied volatility along with the underlying asset
price are represented in Figure 2 for both data sets.


2.2. Calibration methodology 


The PDV model (1.1) with the TSPL kernel relies on 7 parameters, namely $(\alpha_1, \delta_1, \alpha_2, \delta_2)$, the parameters
of the two TSPL kernels $K_1$ and $K_2$ (Equation (1.4)), and $(\beta_0, \beta_1, \beta_2)$, respectively the intercept,
the sensitivity to the trend feature and the sensitivity to the volatility feature. These 7 parameters
are calibrated specifically for each maturity using the following steps (which are identical to the ones
implemented by Guyon and Lekeufack (2023), to which we refer for more details):


1. We compute four exponentially weighted moving averages (EWMA) with respective spans of 10,
20, 120 and 250 days of the underlying index returns. Then, we run a ridge regression of the
ATM implied volatility on the four EWMAs and we fit the TSPL kernel $K_1$ on the optimal linear
combination of the exponential kernels, which provides us with initial guesses for $\alpha_1$ and $\delta_1$. The use
of a ridge regression instead of a lasso regression is justified by the fact that we do not need to
maximize the number of zeros (i.e. minimize the number of exponential kernels) in view of the
subsequent fit of a TSPL kernel. By running a ridge regression of the ATM implied variance on
four EWMAs of the underlying index squared returns and fitting the TSPL kernel $K_2$, we similarly
obtain initial guesses for $\alpha_2$ and $\delta_2$.


See also the code provided with their paper: https://github.com/Jordylek/VolatilityIsMostlyPathDependent

Figure 2: Joint evolution of the 1-month ATM implied volatility and its underlying index from March 8,
2012 to December 30, 2022. The split between the train and the test sets is represented through the use
of different background colors.


2. Initial guesses for $\beta_0$, $\beta_1$ and $\beta_2$ are then obtained using a linear regression of the ATM implied
volatility on the features $R_1$ and $\Sigma$ where $\alpha_1$, $\delta_1$, $\alpha_2$ and $\delta_2$ are fixed to the values estimated at
step 1.


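3. Starting from these initial guesses, the 7 parameters are jointly calibrated by solving the following
minimization problem using the least_squares function with the trust-region reflective algorithm
from the scipy Python package:

$$\begin{aligned}
\min_{(\alpha_1,\delta_1,\alpha_2,\delta_2,\beta_0,\beta_1,\beta_2) \in \mathbb{R}^7} \quad & \sum_{t \in \mathcal{T}_{\mathrm{train}}} \left(IV_t^{\mathrm{mkt}} - \beta_0 - \beta_1 R_{1,t} - \beta_2 \Sigma_t\right)^2 \\
\text{s.t.} \quad & \alpha_j, \delta_j > 0 \text{ for } j \in \{1,2\}, \\
& R_{1,t} = \sum_{t - C \le t_i \le t} \frac{Z_{\alpha_1,\delta_1}}{(t - t_i + \delta_1)^{\alpha_1}} \, r_{t_i}, \\
& \Sigma_t = \sqrt{\sum_{t - C \le t_i \le t} \frac{Z_{\alpha_2,\delta_2}}{(t - t_i + \delta_2)^{\alpha_2}} \, r_{t_i}^2},
\end{aligned} \tag{2.2}$$

where $\mathcal{T}_{\mathrm{train}}$ is the set of dates in the train set, $IV_t^{\mathrm{mkt}}$ is the market ATM implied volatility
observed at time $t$ for some fixed maturity and $C$ is a cut-off lag.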
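The joint calibration of step 3 can be sketched on synthetic data as follows. This is a simplified illustration and not the authors' code: step 1 (EWMAs and ridge regression) is replaced by an arbitrary initial guess for the kernel parameters, the positivity constraints are imposed through small positive lower bounds, upper bounds are added for numerical safety, and the names `features`, `residuals`, `calibrate` as well as all numerical values are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

DELTA = 1.0 / 252.0  # business-day time step, in years
CUTOFF = 250         # cut-off lag C in business days (kept small for speed)


def features(returns, a1, d1, a2, d2, cutoff=CUTOFF):
    """Trend feature R_1 and volatility feature Sigma with TSPL kernels."""
    lags = np.arange(cutoff + 1) * DELTA
    out = []
    for alpha, delta, x in ((a1, d1, returns), (a2, d2, returns**2)):
        raw = (lags + delta) ** (-alpha)
        kernel = raw / (raw.sum() * DELTA)  # so that sum(K) * DELTA = 1
        out.append(np.convolve(x, kernel, mode="valid"))
    return out[0], np.sqrt(out[1])


def residuals(theta, returns, iv):
    """Residuals of the PDV model (1.1) for theta = (a1, d1, a2, d2, b0, b1, b2)."""
    a1, d1, a2, d2, b0, b1, b2 = theta
    r1, sigma = features(returns, a1, d1, a2, d2)
    return iv - b0 - b1 * r1 - b2 * sigma


def calibrate(returns, iv, kernel_guess=(1.5, 0.05, 1.5, 0.05)):
    # Step 2: linear regression for the betas with the kernel parameters fixed.
    r1, sigma = features(returns, *kernel_guess)
    design = np.column_stack([np.ones_like(r1), r1, sigma])
    betas, *_ = np.linalg.lstsq(design, iv, rcond=None)
    theta0 = np.concatenate([kernel_guess, betas])
    # Step 3: joint fit with the trust-region reflective algorithm.
    lower = [1e-3, 1e-4, 1e-3, 1e-4, -np.inf, -np.inf, -np.inf]
    upper = [10.0, 1.0, 10.0, 1.0, np.inf, np.inf, np.inf]
    return least_squares(residuals, theta0, args=(returns, iv),
                         bounds=(lower, upper), method="trf")


# Synthetic example: implied volatilities generated by the model itself.
rng = np.random.default_rng(0)
returns = 0.01 * rng.standard_normal(1500)
true_theta = (2.0, 0.03, 1.2, 0.08, 0.05, -0.08, 0.9)
r1_true, sigma_true = features(returns, *true_theta[:4])
iv = true_theta[4] + true_theta[5] * r1_true + true_theta[6] * sigma_true
fit = calibrate(returns, iv)
```

The problem is not convex in the kernel parameters, so the quality of the initial guesses matters; this is precisely the role of steps 1 and 2 in the methodology above.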


2.3. Numerical results 
2.3.1. Performance of the PDV model 


We start by calibrating the PDV model using the methodology described in Section 2.2. Note that
the computation of the features $R_1$ and $\Sigma$ requires truncating the sums at some point parameterized by
$C$. We use for the moment the previous 1,000 business days (i.e. $C = 1000$), consistently with the choice
of Guyon and Lekeufack (2023), but we will discuss later the influence of this hyperparameter. The
performance of the model is measured using the $R^2$ score (the definition is recalled in Equation (1.5)),
which allows us to assess how much of the variance of the implied volatility is explained by the model. The
results are presented in Figure 3. For the S&P 500, we obtain $R^2$ scores between 85% and 93% on the
train set and between 62% and 77% on the test set. For the Euro Stoxx 50, we obtain $R^2$ scores between
85% and 90% on the train set, between 70% and 81% for the first 15 maturities on the test set and
between 50% and 70% for the last maturities. These results indicate that a large part of the movements
of the ATM implied volatility can be explained by the past movements of the underlying asset price. In
this regard, they extend those of Guyon and Lekeufack (2023) to ATM implied volatility data. We also
notice that the $R^2$ scores are overall decreasing with the option maturity: this is quite natural as we
expect long-term options to be less sensitive to the variations of the underlying asset price than
short-term options. This observation is also consistent with the results of Bakshi et al. (2000) who noticed
that "the longer an option's remaining life, the more likely its price goes in the opposite direction with
the underlying asset" suggesting that there is more exogeneity in the evolution of the prices of long-term
options than in those of short-term options.


[Figure 3, panels (a) S&P 500 and (b) Euro Stoxx 50: $R^2$ score (vertical axis, from 0 to 1) against maturity in months (horizontal axis, from 0 to 25). Legend: Train; Test; VIX - Train and VIX - Test (Guyon & Lekeufack) in panel (a); VSTOXX - Train and VSTOXX - Test (Guyon & Lekeufack) in panel (b).]

Figure 3: $R^2$ scores on the train and the test sets as a function of the ATM implied volatility maturity.
The average $R^2$ scores for the VIX and the VSTOXX are also displayed.


The two following subsections deepen the analysis of Figure 3.


2.3.2. Comment on the gap between the scores on the train and the test sets


We observe a gap of approximately 22% for the S&P 500 and 19% for the Euro Stoxx 50 between the
$R^2$ scores on the train set and the test set. Such gaps are usually symptomatic of overfitted models.
However, if we keep only one feature to reduce the complexity of the model, be it the trend feature $R_1$ or
the volatility feature $\Sigma$, the $R^2$ scores are lower (especially with the trend feature) and the gap between
the train and the test sets widens as shown in Figure 4 for the S&P 500 (similar results are obtained for
the Euro Stoxx 50). Another way to deal with overfitting is to add a regularization term in the objective
function. Such a technique is implemented in the following section but does not reduce the gap between
the performance on the train and the test sets. Because of these two arguments, we estimate that the
observed gap is not the result of an overfitted model but rather the result of the fact that the train set is
small (only 8 years of data) and that the test set is of a peculiar nature. Indeed, the test set corresponds
to the post-Covid-19 period which is characterized by a lot of uncertainty related to the Russia-Ukraine
war, inflation, the rise of interest rates, etc. which may have affected the extent to which the volatility
reacts to the underlying index movements. For the S&P 500, the difference between the evolution on the
train set and the test set is very clear: apart from the crash of March 2020, the S&P 500 has experienced
a steady increase with very little variation on the train set while the test set is characterized by a
bull market followed by a bear market with high volatility. Note that the difference between the periods
is however less clear for the Euro Stoxx 50. This claim is supported by Figure 5 representing the S&P
500 and the Euro Stoxx 50 1-month implied volatilities against the two features $R_1$ and $\Sigma$ on the test
set. The shape of the cloud of data points (green dots) indicates a linear relationship with respect to
the trend and volatility features which argues in favor of the validity of the PDV model. However, the
majority of these data points are above the plane fitted on the train set (especially for the S&P 500 which
is consistent with the larger gap between the $R^2$ score on the train and the test sets) which indicates that
the implied volatility has reacted more strongly to the underlying index movements in the post-Covid-19
period. To quantify this observation, we compute the ratio $D$ of the signed distances and the absolute
distances between the observed implied volatilities and the predicted implied volatilities:


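$$D = \frac{\sum_{t} \left(IV_t^{\mathrm{mkt}} - \beta_0 - \beta_1 R_{1,t} - \beta_2 \Sigma_t\right)}{\sum_{t} \left|IV_t^{\mathrm{mkt}} - \beta_0 - \beta_1 R_{1,t} - \beta_2 \Sigma_t\right|}.$$

We obtain a ratio of 18.4% for the S&P 500 and 4.1% for the Euro Stoxx 50 on the test set which is
consistent with the observation that more observed implied volatilities are above the predicted value
than below.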
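The ratio $D$ is straightforward to compute from the observed and predicted implied volatilities. The sketch below uses a hypothetical function name, `signed_to_absolute_ratio`, and toy numbers that are not taken from the data sets.

```python
import numpy as np


def signed_to_absolute_ratio(iv_obs, iv_pred):
    """Ratio of the sum of signed residuals to the sum of absolute residuals."""
    resid = np.asarray(iv_obs, dtype=float) - np.asarray(iv_pred, dtype=float)
    return resid.sum() / np.abs(resid).sum()


# Toy example: residuals are 0.02, -0.02 and 0.05.
d_toy = signed_to_absolute_ratio([0.22, 0.18, 0.25], [0.20, 0.20, 0.20])
# All observations above the prediction.
d_above = signed_to_absolute_ratio([0.21, 0.23], [0.20, 0.20])
```

By construction, $D$ lies in $[-1, 1]$: it equals 1 when all observed values are above the predictions, $-1$ when they are all below, and it is close to 0 when the residuals are balanced around zero.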
Figure 4: $R^2$ scores on the train and the test sets of the S&P 500 as a function of the ATM implied
volatility maturity for the PDV model with only one feature.


2.3.3. Comparison with the scores of Guyon and Lekeufack (2023)


In Figure 3, we also represent (with green and red crosses) the $R^2$ scores obtained when calibrating the
PDV model on the VIX and the VSTOXX (which are the volatility indices of the S&P 500 and the
Euro Stoxx 50 respectively) using the same historical time period (from March 8, 2012 to December 30,
2022). They allow a consistent comparison between our scores and those of Guyon and Lekeufack (2023).
Note that the scores are represented at the same abscissa as the scores obtained on the 1-month implied
volatilities. This choice is motivated by the fact that both the VIX and the VSTOXX are measures of
the 30-day expected variance of their respective underlying index. We observe that the $R^2$ scores on
the test set for the volatility indices are:


1. below the scores for the 1-month implied volatility and 


2. below the scores reported by Guyon and Lekeufack (2023) for the same indices (the difference
being only the train and test periods: their train and test sets respectively span the periods from
January 1, 2000 to December 31, 2018 and from January 1, 2019 to May 15, 2022 while our train
and test sets respectively span the periods from March 8, 2012 to December 31, 2020 and from
January 1, 2021 to December 30, 2022).

Figure 5: 1-month ATM implied volatility level on the test set as a function of the trend feature $R_1$ and
the volatility feature $\Sigma$. The plane corresponds to the predicted values when the PDV model is fitted on
the train set and the green dots correspond to the data points.


The first observation can be explained by the fact that the 1-month ATM options can be traded on the 
market while the VIX and the VSTOXX are calculated as a non-linear combination of prices of calls 
and puts covering the liquid strikes and the two time-to-maturities that are the closest to 30 days (see 
and [STOXX] (2023) for more details). Thus, the effect of the underlying asset movements 
is intuitively less direct on the VIX and the VSTOXX than on the 1-month implied volatility. The second
observation can be understood in the light of the arguments that have been put forward to explain the
gap between the $R^2$ scores on the train and the test sets.


2.3.4. Influence of the cut-off lag 


We mentioned at the beginning of Section 2.3 that we truncated the sums in the expressions of $R_1$ and $\Sigma$
after the previous $C = 1{,}000$ business days. In the following, we study the impact of the hyperparameter
$C$, which we call the cut-off lag. If the cut-off lag is too small, there is a risk of losing some
information from the past, while if the cut-off lag is too large, there is a risk of capturing information
that is actually not relevant for predicting the implied volatility. First, let us point out that there is a
priori no reason to use the same cut-off lag for $R_1$ and $\Sigma$. Therefore, we consider two different
cut-off lag hyperparameters $C_{R_1}$ and $C_\Sigma$. In order to measure their influence on the performance of the
model, we run a 10-fold cross-validation. Before describing this procedure in more detail, we introduce
a third hyperparameter $\lambda$ that penalizes large values of the kernel parameters $\alpha_1$, $\delta_1$, $\alpha_2$ and
$\delta_2$ during the calibration. More specifically, we add an $L^2$ penalization term to the objective function so
that Equation (2.2) becomes:


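$$
\min_{(\alpha_1,\delta_1,\alpha_2,\delta_2,\beta_0,\beta_1,\beta_2) \in \mathbb{R}^7} \ \sum_{t \in \mathcal{T}_{\text{train}}} \left( IV_t - \beta_0 - \beta_1 R_{1,t} - \beta_2 \Sigma_t \right)^2 + \lambda \left( \alpha_1^2 + \delta_1^2 + \alpha_2^2 + \delta_2^2 \right)
\quad \text{s.t. } \alpha_j, \delta_j \ge 0 \text{ for } j \in \{1,2\}
\tag{2.4}
$$

where $R_{1,t} = \sum_{t - C_{R_1} \le t_i \le t} \frac{Z_{\alpha_1,\delta_1}}{(t - t_i + \delta_1)^{\alpha_1}} r_{t_i}$ and $\Sigma_t = \sqrt{\sum_{t - C_\Sigma \le t_i \le t} \frac{Z_{\alpha_2,\delta_2}}{(t - t_i + \delta_2)^{\alpha_2}} r_{t_i}^2}$. The introduction of
this penalization is motivated by the fact that we want to avoid overfitting, as mentioned earlier.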
Note that this modified objective function is minimized using the minimize function with the L-BFGS-B
algorithm from the scipy Python package. Let us now describe the 10-fold cross-validation. For each
maturity, the train set is split into 10 adjacent folds of the same size (222 days each) and for each triplet
$(C_{R_1}, C_\Sigma, \lambda) \in \{5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2000, 2500, 3000\}^2 \times \{10^{-6}, 10^{-5}, \dots, 10^{-1}\}$ and for
all $i \in \{1, \dots, 10\}$, we calibrate on all folds except fold $i$ and we compute the $R^2$ score on fold $i$. This
procedure corresponds to the so-called blocked cross-validation (see e.g. (2020)). Then,
we average the $R^2$ scores over the 10 folds so that we obtain one score per triplet $(C_{R_1}, C_\Sigma, \lambda)$. In Table 1,
we present the triplet $(C_{R_1}, C_\Sigma, \lambda)$ leading to the best average $R^2$ score for each maturity.
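The penalized calibration and the blocked cross-validation can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the study: the data are synthetic (returns and implied volatilities in percent), the TSPL weights are normalized to sum to one, the box bounds on $\alpha_j$ and $\delta_j$ are numerical safeguards, and all function names are hypothetical. Only `scipy.optimize.minimize` with `method="L-BFGS-B"` is taken from the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from scipy.optimize import minimize


def tspl_weights(alpha, delta, cutoff, dt=1.0 / 252.0):
    """Weights (tau + delta)^(-alpha) for lags 0..cutoff-1, normalized to sum to one (assumption)."""
    tau = np.arange(cutoff) * dt
    w = (tau + delta) ** (-alpha)
    return w / w.sum()


def pdv_features(returns, x, c_r1, c_sigma):
    """Trend feature R_1 and volatility feature Sigma with sums truncated at the cut-off lags."""
    n_obs = len(returns) - max(c_r1, c_sigma) + 1
    # Each window row holds `cutoff` consecutive returns; reversing the weights
    # puts the lag-0 weight on the most recent return of the row.
    r1 = sliding_window_view(returns, c_r1) @ tspl_weights(x[0], x[1], c_r1)[::-1]
    var = sliding_window_view(returns**2, c_sigma) @ tspl_weights(x[2], x[3], c_sigma)[::-1]
    return r1[-n_obs:], np.sqrt(var[-n_obs:])


def predict(x, r1, sigma):
    return x[4] + x[5] * r1 + x[6] * sigma


def objective(x, returns, iv, idx, c_r1, c_sigma, lam):
    """Sum of squared errors on the dates `idx` plus the L2 penalty on the kernel parameters."""
    r1, sigma = pdv_features(returns, x, c_r1, c_sigma)
    resid = iv[idx] - predict(x, r1[idx], sigma[idx])
    return np.sum(resid**2) + lam * np.sum(x[:4] ** 2)


def blocked_cv_score(returns, iv, c_r1, c_sigma, lam, x0, bounds, n_folds=10):
    """Average out-of-fold R^2 over adjacent (non-shuffled) folds."""
    folds = np.array_split(np.arange(len(iv)), n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        res = minimize(objective, x0, args=(returns, iv, train_idx, c_r1, c_sigma, lam),
                       method="L-BFGS-B", bounds=bounds)
        r1, sigma = pdv_features(returns, res.x, c_r1, c_sigma)
        err = iv[test_idx] - predict(res.x, r1[test_idx], sigma[test_idx])
        ss_tot = np.sum((iv[test_idx] - iv[test_idx].mean()) ** 2)
        scores.append(1.0 - np.sum(err**2) / ss_tot)
    return float(np.mean(scores))


# Synthetic example: daily returns in percent and an implied volatility generated by the model.
rng = np.random.default_rng(0)
returns = rng.standard_normal(600)
c_r1, c_sigma, lam = 20, 60, 1e-4
x_true = np.array([1.5, 0.02, 1.2, 0.05, 5.0, -2.0, 10.0])
r1_true, sigma_true = pdv_features(returns, x_true, c_r1, c_sigma)
iv = predict(x_true, r1_true, sigma_true) + 0.1 * rng.standard_normal(len(r1_true))

x0 = np.array([1.0, 0.05, 1.0, 0.05, 10.0, 0.0, 5.0])
bounds = [(0.0, 10.0), (1e-3, 1.0)] * 2 + [(None, None)] * 3
cv_score = blocked_cv_score(returns, iv, c_r1, c_sigma, lam, x0, bounds, n_folds=5)
```

Looping `blocked_cv_score` over a grid of $(C_{R_1}, C_\Sigma, \lambda)$ values gives one average score per triplet. Keeping the folds adjacent preserves the time ordering inside each fold, which is the point of the blocked variant.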


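Table 1: Hyperparameters achieving the highest average $R^2$ scores on the test fold within the
10-fold cross-validation procedure.

| Maturity | S&P 500 $C_{R_1}$ | S&P 500 $C_\Sigma$ | S&P 500 $\lambda$ | Euro Stoxx 50 $C_{R_1}$ | Euro Stoxx 50 $C_\Sigma$ | Euro Stoxx 50 $\lambda$ |
|----------|------|------|--------|------|------|--------|
| 1M       | 50   | 500  | 10^-3  | 50   | 250  | 10^-3  |
| 2M       | 50   | 2000 | 10^-3  | 50   | 250  | 10^-2  |
| 3M       | 50   | 2000 | 10^-2  | 25   | 1000 | 10^-?  |
| 4M       | 50   | 2500 | 10^-4  | 25   | 1000 | 10^-3  |
| 5M       | 100  | 2500 | 10^-5  | 25   | 1000 | 10^-6  |
| 6M       | 100  | 2500 | 10^-6  | 25   | 1000 | 10^-6  |
| 7M       | 100  | 2500 | 10^-6  | 10   | 1000 | 10^-6  |
| 8M       | 50   | 1000 | 10^-4  | 10   | 1000 | 10^-4  |
| 9M       | 100  | 1000 | 10^-5  | 10   | 1000 | 10^-4  |
| 10M      | 100  | 1000 | 10^-4  | 10   | 1000 | 10^-3  |
| 11M      | 100  | 1000 | 10^-?  | 10   | 1000 | 10^-?  |
| 12M      | 100  | 1000 | 10^-3  | 10   | 1000 | 10^-4  |
| 13M      | 100  | 1000 | 10^-3  | 10   | 1000 | 10^-?  |
| 14M      | 100  | 1000 | 10^-3  | 10   | 1000 | 10^-?  |
| 15M      | 100  | 1000 | 10^-3  | 10   | 1000 | 10^-?  |
| 16M      | 100  | 1000 | 10^-?  | 500  | 2000 | 10^-?  |
| 17M      | 100  | 1000 | 10^-?  | 500  | 2000 | 10^-?  |
| 18M      | 100  | 1000 | 10^-?  | 500  | 2000 | 10^-?  |
| 19M      | 100  | 1000 | 10^-?  | 500  | 2000 | 10^-?  |
| 20M      | 100  | 1000 | 10^-?  | 500  | 2000 | 10^-?  |
| 21M      | 100  | 1000 | 10^-?  | 500  | 2000 | 10^-?  |
| 22M      | 100  | 1000 | 10^-?  | 500  | 2000 | 10^-?  |
| 23M      | 2000 | 1000 | 10^-?  | 500  | 2000 | 10^-?  |
| 24M      | 2000 | 1000 | 10^-?  | 500  | 2000 | 10^-?  |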


First, it is remarkable that the cut-off lags $C_{R_1}$ are all at most 100 days except for the largest maturities.
Second, the cut-off lags $C_\Sigma$ are at least 1,000 days for all maturities except the first one for the S&P 500
and the first two for the Euro Stoxx 50. Looking at the average $R^2$ scores for all tested triplets, we
observed that the fitting quality is very sensitive to the cut-off lag $C_\Sigma$ of the volatility feature while the
two other hyperparameters $C_{R_1}$ and $\lambda$ have a smaller influence (especially $\lambda$, which can explain the wide
range of values obtained for this hyperparameter in Table 1). Choosing too small a cut-off lag $C_\Sigma$ can
lead to very low $R^2$ scores, especially for the largest maturities. For example, for $C_\Sigma = 500$, we obtain
$R^2$ scores that are even below 0 for the Euro Stoxx 50, as illustrated in Figure 6. Actually, any value of
$C_\Sigma$ strictly below 1000 in the grid $\{5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2000, 2500, 3000\}$ yields overall
poor results such as those exhibited in Figure 6, regardless of the value of $C_{R_1}$. This indicates
that the squared returns up to 1,000 business days in the past are paramount to predict the implied
volatility, particularly for the largest maturities.


Figure 6: $R^2$ scores on the train and the test sets as a function of the ATM implied volatility maturity
(in months) for $C_{R_1} = C_\Sigma = 500$ and $\lambda = 0$. Panels: (a) S&P 500, (b) Euro Stoxx 50.

In order to verify that this conclusion is not an artefact of a bad model calibration, we present in
Figure 7 the correlation between the implied variance and the squared daily returns for all lags between
0 and 3,000 days, both on the train and the test sets. Note that the estimated correlation $\rho$ is presented
along with a 95% confidence interval derived from the transformation $z = \operatorname{artanh}(\rho)$ introduced by Fisher
(1915). Indeed, the transformed estimator is approximately normally distributed when the samples come from
a bivariate normal distribution. Although this is not the case here, we consider it as a reasonable proxy
of the uncertainty around the correlation estimator. On the train set (blue curve), the graphs show a
slow decrease of the correlation with the lag, with several spikes at some specific lags that become larger
when the maturity increases. A first spike can be seen around 250 days, i.e. 1 year, especially for the
Euro Stoxx 50. Then a smaller spike can be seen around 500 days, i.e. 2 years. A third spike appears
around 750 days, i.e. 3 years, which is characterized by a slower decay than the previous spikes as it only
fades around 1,250 days. This third spike is even higher than the previous spikes for large maturities.
After these three spikes, we again observe smaller spikes around 1,750 days and 2,500 days and a
big spike around 2,750 days, that is a spike almost every year. These observations are consistent with
the sensitivity of the model to the cut-off lag $C_\Sigma$ and the values obtained with the cross-validation that
are presented in Table 1. Note that these observations have the advantage of not depending on any
assumption. Thus, we can consider the long-range dependence of the implied volatility on the past squared
returns as a stylized fact of our implied volatility data. An empirical study of a larger set of underlying
assets could reveal whether this is a universal property of implied volatility data. To our knowledge,
this stylized fact has never been reported in the literature. Let us however mention the work of
who calibrate a 3-factor model on implied volatility data and show a long-range
dependence in the level and absolute returns of the factor loading series.

A possible explanation of
this phenomenon is that options on widespread equity indices and with relatively large maturities are
presumably traded by long-term investors such as asset managers, pension funds, sovereign funds, etc.,
who have a low rebalancing frequency of their portfolios and who consequently base their investment
decisions on the returns of the underlying asset over the previous years rather than over the previous days.
On the other hand, options with shorter maturities are presumably less traded by long-term investors
so that the movements of the implied volatility are more influenced by short-term investors such as
hedge funds who base their investment decisions on recent data. This is in line with Figures and
as well as with the smaller values of $C_\Sigma$ for the smallest maturities in Table 1.

In Figure 8, we
also present the correlations between the implied volatility and the daily returns for all lags between 0
and 3,000 days, both on the train and the test sets. In this case, the correlations fade very quickly with
the lag and we do not observe material spikes. This is consistent with the smaller values of $C_{R_1}$ in Table 1.


So far, we have only described the correlations on the train set but, as already discussed extensively,
the test set is quite different and the correlations on this set (in orange in Figures 7 and 8) are therefore
also distinct from those calculated on the train set. In particular, in Figure 7, we observe negative
correlations with the 250-day lag, which can be understood as a consequence of the fact that the implied
volatility was decreasing in 2021 due to the leverage effect while one year earlier the squared returns were
increasing with the Covid-19 crisis. Conversely, the implied volatility increased in 2022, again due to the
leverage effect, while one year earlier the squared returns were decreasing with the post-Covid-19 bull
market. Given the particular profile of the correlation structure, it is natural to consider a variation of
the PDV model that can capture the spikes. We have considered two variations of the PDV model
designed to capture the spike at the 3-year lag as it represents the largest spike. These two variations
consist in adding a third feature which is the same as the volatility feature $\Sigma$ except that the kernel
specifically weights one period in the past. The two kernels that we have considered are the following:

$$
K_3^{1}(\tau) = (\alpha\tau - \delta)^+ \exp(-\lambda\tau) \quad \text{and} \quad K_3^{2}(\tau) = \exp\left(-\frac{(\tau-\mu)^2}{2\sigma^2}\right) \tag{2.5}
$$

where $\alpha$, $\delta$, $\lambda$, $\mu$ and $\sigma$ are non-negative parameters. The calibration of these alternative models only
provides a small improvement of the $R^2$ scores on the train set and even a deterioration of the $R^2$ scores
on the test set, likely due to the specificity of the test set mentioned earlier. As a consequence, these
alternative models are disregarded in the sequel.
Figure 7: Correlation between the squared ATM implied volatility and the squared daily returns as a
function of the lag both on the train and the test sets. The 95% confidence interval is derived from the
Fisher transformation.

Figure 8: Correlation between the ATM implied volatility and the daily returns as a function of the lag both
on the train and the test sets. The 95% confidence interval is derived from the Fisher transformation.
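The lagged correlations and their Fisher-transformation intervals shown in Figures 7 and 8 can be reproduced along the following lines. This is a sketch on synthetic series: the function name is hypothetical, and the standard error $1/\sqrt{n-3}$ is the textbook value for independent bivariate normal samples, used here only as a proxy as in the text.

```python
import numpy as np
from scipy.stats import norm


def lagged_correlation_ci(x, y, lag, level=0.95):
    """Correlation between x_t and y_{t-lag} with a Fisher-transformation interval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a, b = x[lag:], y[: len(y) - lag]  # pairs (x_t, y_{t-lag})
    rho = np.corrcoef(a, b)[0, 1]
    z = np.arctanh(rho)
    half_width = norm.ppf(0.5 + level / 2.0) / np.sqrt(len(a) - 3)
    return rho, np.tanh(z - half_width), np.tanh(z + half_width)


# Synthetic example: an "implied variance" driven by squared returns five days earlier.
rng = np.random.default_rng(1)
squared_returns = rng.standard_normal(500) ** 2
implied_variance = 0.5 * np.roll(squared_returns, 5) + rng.standard_normal(500) ** 2
rho, lower, upper = lagged_correlation_ci(implied_variance, squared_returns, lag=5)
```

Mapping the interval back with `tanh` keeps both bounds inside $(-1, 1)$, which a symmetric interval built directly around $\rho$ would not guarantee.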
2.3.5. Study of the calibrated parameters


We conclude this empirical study by analyzing the calibrated parameters of the PDV model. In order to
obtain comparable model parameters between maturities, we retain a single triplet $(C_{R_1}, C_\Sigma, \lambda)$ for all
maturities. This triplet is selected as follows. For each maturity, we compute the average $R^2$ score over
the 10 folds of each triplet and then, we average these scores over all maturities. Finally, for each triplet
$(C_{R_1}, C_\Sigma, \lambda)$, we average the obtained score with the ones of the triplets $(C'_{R_1}, C'_\Sigma, \lambda')$ such that $C'_\Sigma = C_\Sigma$
and either $C'_{R_1}$ is the closest value above or below $C_{R_1}$ in the grid $\{5, 10, 25, 50, 100, 250, 500, 1000, \dots, 2500, 3000\}$
(with $\lambda' = \lambda$) or $\lambda'$ is the closest value above or below $\lambda$ in the grid $\{10^{-6}, 10^{-5}, \dots, 10^{-1}\}$ (with $C'_{R_1} = C_{R_1}$). For example, the score
of the triplet $(50, 500, 10^{-3})$ is averaged with the ones of the triplets $(25, 500, 10^{-3})$, $(100, 500, 10^{-3})$,
$(50, 500, 10^{-4})$ and $(50, 500, 10^{-2})$. The triplet that is chosen for all maturities is the one achieving the
highest score through this procedure. This procedure aims at selecting a triplet whose performance is not
too sensitive to a modification of $C_{R_1}$ or $\lambda$. It is introduced because we observed that if we consider only
the average score over all maturities, the performance of the obtained triplet was very sensitive to these
two hyperparameters (unlike most triplets as underlined earlier) and was quite bad on the test set. This
instability is probably due to the small size of the train set, which is divided into 10 folds in the blocked
cross-validation. With this procedure, we obtain $(C_{R_1}, C_\Sigma, \lambda) = (100, 1000, 10^{-4})$ for the S&P 500 and
$(C_{R_1}, C_\Sigma, \lambda) = (10, 1000, 10^{-3})$ for the Euro Stoxx 50. In Figure 9, we show the $R^2$ scores obtained
on the train and the test sets with these hyperparameters. We notice overall a small deterioration in
comparison with Figure 3 where $(C_{R_1}, C_\Sigma, \lambda) = (1000, 1000, 0)$. This deterioration can be attributed to
the fact that the hyperparameters are optimized on the train set only, which differs in several ways from
the test set.
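The neighbour-averaged selection described above can be sketched as follows. The score array is made up, and the treatment of grid boundaries (averaging over the neighbours that exist) is an assumption, since the text does not say how edge triplets are handled.

```python
import numpy as np


def select_robust_triplet(scores, grid_c, grid_lam):
    """Pick the triplet whose score, averaged with its C_R1 and lambda neighbours, is highest.

    scores[i, j, l] is the average R^2 (over folds and maturities) of the triplet
    (C_R1 = grid_c[i], C_Sigma = grid_c[j], lambda = grid_lam[l]).
    """
    n_c, _, n_lam = scores.shape
    best, best_score = None, -np.inf
    for i in range(n_c):
        for j in range(n_c):
            for l in range(n_lam):
                # The triplet itself, then its neighbours in C_R1 and in lambda
                # (C_Sigma is kept fixed); missing neighbours at the grid edges are skipped.
                cells = [(i, l)]
                cells += [(i2, l) for i2 in (i - 1, i + 1) if 0 <= i2 < n_c]
                cells += [(i, l2) for l2 in (l - 1, l + 1) if 0 <= l2 < n_lam]
                smoothed = np.mean([scores[a, j, b] for a, b in cells])
                if smoothed > best_score:
                    best, best_score = (grid_c[i], grid_c[j], grid_lam[l]), smoothed
    return best, best_score


# Made-up scores: one isolated high score and one broad plateau at C_Sigma = 25.
grid_c = [5, 10, 25, 50]
grid_lam = [1e-3, 1e-2, 1e-1]
scores = np.zeros((4, 4, 3))
scores[0, 0, 0] = 1.0
scores[1:4, 2, :] = 0.8
best_triplet, best_smoothed = select_robust_triplet(scores, grid_c, grid_lam)
```

In this toy example the isolated score of 1.0 is discarded in favour of a triplet on the plateau, which is the behaviour the procedure is designed to produce.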
Figure 9: $R^2$ scores on the train and the test sets as a function of the ATM implied volatility maturity.
The hyperparameters for the S&P 500 are $(C_{R_1}, C_\Sigma, \lambda) = (100, 1000, 10^{-4})$ and those of the Euro Stoxx
50 are $(C_{R_1}, C_\Sigma, \lambda) = (10, 1000, 10^{-3})$.


In Figures 10 and 11, we plot the evolution of the calibrated parameters associated with the $R^2$ scores
presented in Figure 9 as a function of the maturity. Regarding the TSPL kernel parameters ($\alpha_1$, $\delta_1$,
$\alpha_2$, $\delta_2$), we observe overall a decreasing trend except for $\delta_2$ for which there is no clear trend. This
decreasing trend for $\alpha_1$ and $\alpha_2$ indicates that far away past returns explain more and more of the ATM
implied volatility movements as the maturity increases. Note that, for the largest maturities, we obtain
values of $\alpha$ that become even lower than 1 (except for $\alpha_1$ for the S&P 500), which is the critical value
below which the integral of the TSPL kernel diverges in continuous time. The decreasing behavior of $\delta_1$
indicates that, as $\alpha_1$ decreases, it still matters to keep a large weight for the more recent returns. Let
us now end the study with the analysis of the parameters $\beta_0$, $\beta_1$ and $\beta_2$. First, we notice that we have
$\beta_1 < 0$ and $\beta_2 > 0$ without imposing any constraint on these parameters. Therefore, a positive (resp.
negative) trend in the underlying asset price tends to be followed by a decrease (resp. increase) of the
implied volatility (which is consistent with the negative correlation observed by
(2002)) while the increase (resp. decrease) of the underlying asset volatility (measured by the squared
returns) tends to be followed by an increase (resp. decrease) of the implied volatility. Moreover, the three
parameters for the small maturities are of the same order of magnitude as those calibrated by Guyon
and Lekeufack (2023). Regarding the evolution of $\beta_0$, we obtain an overall increase with the maturity
which reflects the fact that, on average, the level of ATM implied volatility increases with the maturity.
The parameter $\beta_1$, which can be interpreted as the influence of the trend feature on the implied volatility,
gets closer to 0 as the maturity increases, implying that the implied volatility for long maturities becomes
less reactive to the trend in the returns of the underlying asset price. Finally, the parameter $\beta_2$, which
can be interpreted as the influence of the volatility feature on the implied volatility, is mainly decreasing
with the maturity, so it seems that the implied volatility for long maturities becomes less reactive to the
volatility of the underlying index.

Figure 10: Calibrated TSPL kernel parameters as a function of the maturity. The hyperparameters for
the S&P 500 are $(C_{R_1}, C_\Sigma, \lambda) = (100, 1000, 10^{-4})$ and those of the Euro Stoxx 50 are $(C_{R_1}, C_\Sigma, \lambda) =
(10, 1000, 10^{-3})$.

Figure 11: Calibrated $\beta_0$, $\beta_1$ and $\beta_2$ as a function of the maturity. The hyperparameters for the S&P 500
are $(C_{R_1}, C_\Sigma, \lambda) = (100, 1000, 10^{-4})$ and those of the Euro Stoxx 50 are $(C_{R_1}, C_\Sigma, \lambda) = (10, 1000, 10^{-3})$.
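The critical value $\alpha = 1$ mentioned above can be checked directly. The computation below assumes that the unnormalized TSPL kernel has the form $(\tau+\delta)^{-\alpha}$ with $\delta > 0$, as in Guyon and Lekeufack (2023), and writes $C$ for a cut-off expressed in the same time unit as $\tau$.

```latex
\int_0^{C} (\tau+\delta)^{-\alpha}\,d\tau =
\begin{cases}
\dfrac{(C+\delta)^{1-\alpha} - \delta^{1-\alpha}}{1-\alpha}, & \alpha \neq 1,\\[2mm]
\log\dfrac{C+\delta}{\delta}, & \alpha = 1,
\end{cases}
\qquad
\lim_{C\to\infty}\int_0^{C} (\tau+\delta)^{-\alpha}\,d\tau =
\begin{cases}
\dfrac{\delta^{1-\alpha}}{\alpha-1}, & \alpha > 1,\\[2mm]
+\infty, & \alpha \le 1.
\end{cases}
```

With a finite cut-off lag the truncated integral stays finite for every $\alpha \ge 0$, so calibrated values of $\alpha$ below 1 remain admissible in the truncated, discrete-time setting used here.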
The empirical study conducted in this section allowed us to exhibit the dependence of the ATM
implied volatility on the past path of the underlying asset price for two major financial indices. We
showed that this dependence decreases with the maturity but remains material even for the largest
maturities. Moreover, the feedback effect of the underlying price onto the ATM implied volatility has a
very long memory: up to 4 years of the past evolution of the underlying price should be used to predict
the ATM implied volatility for the largest maturities. At this stage, it is natural to ask whether these
conclusions still hold for away-from-the-money implied volatilities. Instead of reproducing the empirical
study for each maturity and strike (which would significantly increase the dimension of the study), we
study the performance of the PDV model in explaining the evolution of the calibrated parameters of
the SSVI parameterization of Gatheral and Jacquier (2014). The following section is dedicated to the
presentation of this parameterization.


3. Calibration of market implied volatilities with the SSVI parameterization


The purpose of this section is to introduce the SSVI parameterization and to present some calibration
results of this model on the implied volatility historical data that we considered in Section 2. We start
with some reminders about static arbitrage, as the ability to generate arbitrage-free implied volatility
surfaces (IVSs) is one of our motivations for considering the SSVI parameterization.


3.1. Static arbitrages 


An IVS is free from static arbitrage if there is no arbitrage opportunity by static trading of call and put 
options with prices given by inserting their implied volatility in the Black-Scholes formula. The formal 
definition of absence of static arbitrage is provided below. 


Definition 3.1 (Absence of static arbitrage). Let us denote by $C_{BS}(K,T,\sigma)$ the Black-Scholes price of a
European call option of strike $K$ and maturity $T$ when the constant volatility is $\sigma$. An IVS $(K,T) \mapsto
\sigma_{BS}(K,T)$ is free of static arbitrage if there exists a non-negative martingale, say $(S_t)_{t \ge 0}$, on some
filtered probability space $(\Omega, \mathcal{F}, \mathbb{P})$ such that $C(K,T) := C_{BS}(K,T,\sigma_{BS}(K,T)) = \mathbb{E}[e^{-rT}(S_T - K)^+]$ for
all $K, T > 0$, where $r$ is the risk-free interest rate.


Roper (2010) provides sufficient conditions under which an IVS is free of static arbitrage.


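Theorem 3.1 (Theorem 2.9 from Roper, 2010). Consider the total implied variance defined by $w(k,T) =
\sigma_{BS}^2(k,T)\,T$ where $\sigma_{BS}(k,T)$ is the Black-Scholes implied volatility associated to the log-strike³ $k$ and the
maturity $T$. If $w : \mathbb{R} \times \mathbb{R}_+ \to \mathbb{R}_+$ satisfies the following conditions:

(i) $w(\cdot,T)$ is of class $C^2$ for all $T > 0$,
(ii) $w(k,T) > 0$ for all $(k,T) \in \mathbb{R} \times \mathbb{R}_+^*$,
(iii) for each $(k,T) \in \mathbb{R} \times \mathbb{R}_+^*$,

$$
\left(1 - \frac{k\,\partial_k w(k,T)}{2\,w(k,T)}\right)^2 - \frac{(\partial_k w(k,T))^2}{4}\left(\frac{1}{w(k,T)} + \frac{1}{4}\right) + \frac{\partial^2_{kk} w(k,T)}{2} \ge 0,
\tag{3.1}
$$

(iv) $w(k,\cdot)$ is non-decreasing for each $k \in \mathbb{R}$,
(v) $-k/\sqrt{w(k,T)} + \sqrt{w(k,T)}/2 \to -\infty$ as $k \to +\infty$ for all $T > 0$ and,
(vi) $w(k,0) = 0$ for all $k \in \mathbb{R}$,

then the total implied variance surface $w$ is free of static arbitrage.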


Remark 3.1. An IVS is said to be free of butterfly arbitrage if conditions (iii) and (v) are satisfied
and it is said to be free of calendar spread arbitrage if condition (iv) is satisfied. To our knowledge, this
terminology is due to Gatheral and Jacquier (2014).
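Conditions (iii) and (iv) of Theorem 3.1 lend themselves to a numerical check on a grid of log-strikes. The sketch below is an illustration with hypothetical function names; the derivatives are approximated by finite differences, so a value slightly below zero near the edges of the grid should be read with care.

```python
import numpy as np


def butterfly_function(k, w):
    """Left-hand side of condition (iii), with derivatives taken by finite differences."""
    dw = np.gradient(w, k)
    d2w = np.gradient(dw, k)
    return (1.0 - k * dw / (2.0 * w)) ** 2 - 0.25 * dw**2 * (1.0 / w + 0.25) + 0.5 * d2w


def violates_calendar_condition(w_short, w_long):
    """Condition (iv) on a common log-strike grid: total variance must not decrease with maturity."""
    return bool(np.any(np.asarray(w_long) < np.asarray(w_short)))


k = np.linspace(-1.0, 1.0, 201)

# A flat (Black-Scholes) slice satisfies condition (iii) everywhere.
flat_slice = np.full_like(k, 0.04)
g_flat = butterfly_function(k, flat_slice)

# A slice with a narrow bump is strongly concave at the money and violates condition (iii).
bumped_slice = 0.04 + 0.03 * np.exp(-(k**2) / (2.0 * 0.05**2))
g_bumped = butterfly_function(k, bumped_slice)
```

The bumped slice fails because the second-derivative term is large and negative at $k = 0$, which corresponds to a negative implied density there.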


³ We recall that the log-strike $k$ of a vanilla option of strike $K$ and forward price $F = S_0 e^{rT}$ is defined as
$k = \log(K/F)$.
3.2. The SSVI parameterization


Devised at Merrill Lynch in 1999 and publicly disseminated by Gatheral (2004), the Stochastic Volatility
Inspired (SVI) parameterization is a popular parameterization of the implied volatility smile. To be
more precise, it is a parameterization of the total implied variance that we defined in Theorem 3.1.
The standard formulation of the SVI parameterization is the so-called raw SVI parameterization and is
presented below.


Definition 3.2 (Raw SVI). For a given maturity $T > 0$, the raw SVI parameterization writes:

$$
w(k,T) = a_T + b_T \left( \rho_T (k - m_T) + \sqrt{(k - m_T)^2 + \sigma_T^2} \right) \tag{3.2}
$$

where $a_T \in \mathbb{R}$, $b_T > 0$, $|\rho_T| < 1$, $m_T \in \mathbb{R}$ and $\sigma_T > 0$. Moreover, the parameters must satisfy
$a_T + b_T \sigma_T \sqrt{1 - \rho_T^2} > 0$ to ensure that the total implied variance remains positive for all $k \in \mathbb{R}$.


The popularity of this parameterization is mainly due to its tractability and its ability to fit market
implied volatilities quite well. Moreover, it features nice properties such as consistency with Lee's moment
formula or the fact that it corresponds exactly to the large-maturity limit of the Heston
implied volatility smile (2011). In their paper, Gatheral and Jacquier (2014)
proposed an extension of the SVI parameterization to address two issues of this parameterization. First,
the SVI parameterization is not a parameterization of the full total implied variance surface but only of
a slice $k \mapsto w(k,T)$ for a fixed maturity $T$ since the 5 parameters are all maturity-dependent. Second, at
the time of the publication of their paper, it seemed impossible to find conditions on the SVI parameters
that guarantee the absence of butterfly arbitrage (the problem has now been solved by
(2022)). The extension of the SVI parameterization that they propose to address these issues is
called the surface SVI (SSVI) and is defined below.


Definition 3.3. Let $\varphi$ be a smooth function from $\mathbb{R}_+^*$ to $\mathbb{R}_+^*$ such that the limit $\lim_{T \to 0} \theta_T \varphi(\theta_T)$ exists
in $\mathbb{R}$ where $\theta_T := \sigma_{BS}^2(0,T)\,T$ is the ATM total implied variance. The SSVI is the surface defined by:

$$
w(k,T) = \frac{\theta_T}{2} \left( 1 + \rho\,\varphi(\theta_T)\,k + \sqrt{(\varphi(\theta_T)\,k + \rho)^2 + (1 - \rho^2)} \right) \tag{3.3}
$$

Remark 3.2. By abuse of notation, we use the same notation $\sigma_{BS}$ for the implied volatility as a function
of the strike or the implied volatility as a function of the log-strike.

While the SVI parameterization requires 5 parameters for each slice of the IVS, the SSVI relies on the
function $\varphi$ and the parameter $\rho \in (-1, 1)$, which do not depend on the maturity, as well as one parameter
$\theta_T$ for each maturity, which could be considered as set prior to the calibration
since the ATM total implied variance for the traded maturities can be directly observed on the market.
Note that Hendriks and Martini (2017) propose to consider a maturity-dependent $\rho$ parameter in order to
improve the calibration accuracy for very short maturities (typically below 1 month). Since our database
does not contain short-term data, this extension of the SSVI is not investigated in this paper.


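Remark 3.3. For a fixed $T > 0$, the corresponding raw SVI parameterization is given by

$$
(a_T, b_T, \rho_T, m_T, \sigma_T) = \left( \frac{\theta_T}{2}(1 - \rho^2),\ \frac{\theta_T\,\varphi(\theta_T)}{2},\ \rho,\ -\frac{\rho}{\varphi(\theta_T)},\ \frac{\sqrt{1 - \rho^2}}{\varphi(\theta_T)} \right).
$$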
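The correspondence of Remark 3.3 can be verified numerically. The following sketch implements Equations (3.2) and (3.3) and checks that the two total variances coincide; the function names and the parameter values are illustrative assumptions.

```python
import numpy as np


def ssvi_total_variance(k, theta, rho, phi):
    """SSVI total implied variance (3.3) for one maturity, with phi = phi(theta)."""
    return 0.5 * theta * (1.0 + rho * phi * k + np.sqrt((phi * k + rho) ** 2 + 1.0 - rho**2))


def raw_svi_total_variance(k, a, b, rho, m, sigma):
    """Raw SVI total implied variance (3.2) for one maturity."""
    return a + b * (rho * (k - m) + np.sqrt((k - m) ** 2 + sigma**2))


def ssvi_to_raw_svi(theta, rho, phi):
    """Raw SVI parameters (a, b, rho, m, sigma) matching an SSVI slice (Remark 3.3)."""
    return (0.5 * theta * (1.0 - rho**2), 0.5 * theta * phi, rho,
            -rho / phi, np.sqrt(1.0 - rho**2) / phi)


k = np.linspace(-1.0, 1.0, 41)
theta, rho, phi = 0.04, -0.7, 2.5  # illustrative values
w_ssvi = ssvi_total_variance(k, theta, rho, phi)
w_raw = raw_svi_total_variance(k, *ssvi_to_raw_svi(theta, rho, phi))
```

At $k = 0$ the SSVI slice returns $\theta_T$ exactly, which is why $\theta_T$ can be read off the ATM market quote.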


Gatheral and Jacquier provide sufficient conditions for the SSVI to be free of arbitrage. These 
conditions are presented in the following theorem. 


Theorem 3.2 (Corollary 4.1 from Gatheral and Jacquier (2014)). The SSVI is free of static arbitrage
if the following conditions are satisfied:

(i) $\partial_T \theta_T \ge 0$ for all $T > 0$;

(ii) $0 \le \partial_\theta(\theta\varphi(\theta)) \le \frac{1}{\rho^2}\left(1 + \sqrt{1 - \rho^2}\right)\varphi(\theta)$ for all $\theta > 0$;

(iii) $\theta\varphi(\theta)(1 + |\rho|) < 4$ for all $\theta > 0$;

(iv) $\theta\varphi(\theta)^2(1 + |\rho|) \le 4$ for all $\theta > 0$.


Remark 3.4. The conditions (i) and (ii) actually are necessary and sufficient conditions for the absence
of calendar spread arbitrage for the SSVI. The condition (iii) with a non-strict inequality is a necessary
condition for the absence of butterfly arbitrage but condition (iv) is only a necessary condition if
$\theta\varphi(\theta)(1 + |\rho|) = 4$.

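Remark 3.5. The conditions (ii), (iii) and (iv) can be weakened as follows:

(ii) $0 \le \partial_\theta(\theta\varphi(\theta))|_{\theta = \theta_T} \le \frac{1}{\rho^2}\left(1 + \sqrt{1 - \rho^2}\right)\varphi(\theta_T)$ for all $T > 0$;

(iii) $\theta_T\varphi(\theta_T)(1 + |\rho|) < 4$ for all $T > 0$;

(iv) $\theta_T\varphi(\theta_T)^2(1 + |\rho|) \le 4$ for all $T > 0$.

Note that these conditions are not necessarily equivalent to the ones in Theorem 3.2 since $T \mapsto \theta_T$ is not
necessarily a bijection from $\mathbb{R}_+^*$ to $\mathbb{R}_+^*$.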


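A natural question at this stage is how to choose the function $\varphi$ in order to both achieve a good
fit to market data and satisfy the above conditions. The authors propose three examples of parametric
form for $\varphi$:

• the Heston-like parameterization $\varphi(\theta) := \frac{1}{\lambda\theta}\left(1 - \frac{1 - e^{-\lambda\theta}}{\lambda\theta}\right)$ with $\lambda > 0$,

• the power-law parameterization $\varphi(\theta) := \frac{\eta}{\theta^\gamma}$ with $\eta > 0$ and $0 < \gamma < 1$, and

• the modified power-law parameterization $\varphi(\theta) := \frac{\eta}{\theta^\gamma(1 + \theta)^{1-\gamma}}$ with $\eta > 0$ and $0 < \gamma < 1$.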

For simplicity of reference to these parameterizations, we abbreviate the SSVI with the Heston-like
parameterization to SSVI-HL, the SSVI with the power-law parameterization to SSVI-PL and the SSVI
with the modified power-law parameterization to SSVI-MPL. The following propositions translate the
sufficient conditions of Theorem 3.2 for these three parameterizations. Their proofs can be found in
Appendix A.
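Before turning to the analytical conditions, the three parametric forms and conditions (ii) to (iv) of Theorem 3.2 can be explored numerically on a grid of $\theta$ values. The sketch below is only a sanity check under assumptions: the grid is finite, the derivative $\partial_\theta(\theta\varphi(\theta))$ is approximated by finite differences, and the function names are hypothetical.

```python
import numpy as np


def phi_heston_like(theta, lam):
    x = lam * theta
    # 1 - (1 - exp(-x)) / x, written with expm1 for accuracy at small x
    return (1.0 + np.expm1(-x) / x) / x


def phi_power_law(theta, eta, gamma):
    return eta * theta ** (-gamma)


def phi_modified_power_law(theta, eta, gamma):
    return eta / (theta**gamma * (1.0 + theta) ** (1.0 - gamma))


def check_ssvi_conditions(theta, phi_values, rho):
    """Evaluate conditions (ii), (iii) and (iv) of Theorem 3.2 on the grid `theta`."""
    theta_phi = theta * phi_values
    d_theta_phi = np.gradient(theta_phi, theta)
    # For rho = 0 the upper bound in (ii) is vacuous.
    scale = np.inf if rho == 0.0 else (1.0 + np.sqrt(1.0 - rho**2)) / rho**2
    return {
        "(ii)": bool(np.all((d_theta_phi >= 0.0) & (d_theta_phi <= scale * phi_values))),
        "(iii)": bool(np.all(theta_phi * (1.0 + abs(rho)) < 4.0)),
        "(iv)": bool(np.all(theta_phi * phi_values * (1.0 + abs(rho)) <= 4.0)),
    }


theta_grid = np.geomspace(1e-4, 10.0, 2000)
mpl_check = check_ssvi_conditions(theta_grid, phi_modified_power_law(theta_grid, 1.0, 0.5), -0.7)
pl_check = check_ssvi_conditions(theta_grid, phi_power_law(theta_grid, 3.0, 0.5), 0.9)
heston_values = phi_heston_like(theta_grid, 1.0)
```

With these illustrative parameters the modified power-law form passes all three checks on the grid, while the power-law form with $\eta = 3$ and $\rho = 0.9$ fails condition (iv) since $\eta^2(1+|\rho|) > 4$.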


Proposition 3.1. The SSVI-HL is free of static arbitrage if $\partial_T \theta_T \ge 0$ for all $T > 0$ and $\lambda \ge (1 + |\rho|)/4$.


Proposition 3.2. Assuming that $\partial_T \theta_T \ge 0$, we have the following cases:

(i) If $\gamma \in (0, 1/2)$, there exist $\theta_1^*, \theta_2^* > 0$ such that the SSVI-PL is free of static arbitrage if $\theta_T < \theta_1^* \wedge \theta_2^*$
for all $T > 0$.

(ii) If $\gamma \in (1/2, 1)$, there exist $\theta_1^*, \theta_2^* > 0$ such that the SSVI-PL is free of static arbitrage if $\theta_2^* < \theta_T <
\theta_1^*$ for all $T > 0$.

(iii) If $\gamma = 1/2$ and $\eta^2(1 + |\rho|) < 4$, there exists $\theta_1^*$ such that the SSVI-PL is free of static arbitrage if
$\theta_T < \theta_1^*$ for all $T > 0$.

Proposition 3.3. Assuming that $\partial_T \theta_T \ge 0$, we have the following cases:

(i) If $\gamma \in (0, 1/2)$ and $\eta(1 + |\rho|) < 4$, there exists $\theta^* > 0$ such that the SSVI-MPL is free of static
arbitrage if, for all $T > 0$, $\theta_T > \theta^*$.


(ii) Ify € (1/2,1), the SSVI-MPL is free of static arbitrage for n(1+|p|) < 4 and (1—2y)y(1—2y)?(1+ 
lel) < 4. 


(iii) If $\gamma = 1/2$, the SSVI-MPL is free of static arbitrage for $\eta^2(1 + |\rho|) < 4$.

Remark 3.6. Note that different sufficient conditions could be found for guaranteeing the absence of
static arbitrage by restricting the maturity $T$ to some subset of $\mathbb{R}_+^*$. Since it is not very satisfying to have
an IVS parameterization that could be arbitrable for some maturities, we looked as much as possible for
conditions that do not restrict the values of $T$.
Remark 3.7. For $\gamma = 1/2$, the SSVI-PL and the SSVI-MPL parameterizations induce a power-law
decay of the ATM volatility skew for small maturities, which is a well-known stylized fact of implied
volatility surfaces (see e.g. Gatheral et al., 2023). Indeed, recalling that the ATM volatility skew is
defined by $\partial_k \sigma_{BS}(0,T)$, it is straightforward to show that $\partial_k \sigma_{BS}(0,T) = \frac{\rho}{2\sqrt{T}}\sqrt{\theta_T}\,\varphi(\theta_T)$ for the SSVI
parameterization (no matter the choice for $\varphi$). Therefore, we have that $\partial_k \sigma_{BS}(0,T) = \frac{\rho\eta}{2\sqrt{T}}$ for the SSVI-PL
and $\partial_k \sigma_{BS}(0,T) = \frac{\rho\eta}{2\sqrt{T(1 + \theta_T)}} = \frac{\rho\eta}{2\sqrt{T}} + o\left(\frac{1}{\sqrt{T}}\right)$ for the SSVI-MPL assuming that $\lim_{T \to 0} \theta_T = 0$,
which is a natural assumption: an ATM option with zero time to expiry has no value.
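The skew formula of Remark 3.7 follows from differentiating Equation (3.3) at $k = 0$ and using $w = \sigma_{BS}^2 T$. The short derivation below spells out this step; it uses only the definitions of this section.

```latex
\partial_k w(k,T)\big|_{k=0}
  = \frac{\theta_T}{2}\left(\rho\,\varphi(\theta_T)
    + \frac{\varphi(\theta_T)\,(\varphi(\theta_T)k+\rho)}{\sqrt{(\varphi(\theta_T)k+\rho)^2+1-\rho^2}}\Bigg|_{k=0}\right)
  = \rho\,\theta_T\,\varphi(\theta_T),
\qquad
\partial_k w = 2\,\sigma_{BS}\,T\,\partial_k\sigma_{BS}
\;\Longrightarrow\;
\partial_k\sigma_{BS}(0,T)
  = \frac{\rho\,\theta_T\,\varphi(\theta_T)}{2\,T\,\sqrt{\theta_T/T}}
  = \frac{\rho}{2\sqrt{T}}\sqrt{\theta_T}\,\varphi(\theta_T).
```

Substituting $\varphi(\theta) = \eta\,\theta^{-1/2}$ gives $\rho\eta/(2\sqrt{T})$, and substituting $\varphi(\theta) = \eta\,(\theta(1+\theta))^{-1/2}$ gives $\rho\eta/(2\sqrt{T(1+\theta_T)})$.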


3.3. Calibration results and introduction of the parsimonious SSVI model 


In this section, we show to what extent the SSVI parameterization can replicate the historical IVSs
that we presented in Section 2.1. As already noted by Cont and Vuletić (2023), IVSs from data providers
are not necessarily arbitrage-free because of interpolations of actual market quotes. Procedures such as
the ones of or Cohen et al. allow one to detect arbitrages in a finite set of
prices of European call options given the forward prices. Our database does not contain short rates data
or forward prices data so we cannot use these procedures. Since the absence of calendar spread arbitrage is equivalent
to the total implied variance being non-decreasing in maturity, we remove the IVSs (we recall that our
data sets contain one IVS per business day) such that there is at least one crossing between the linearly
interpolated total implied variances of two adjacent maturities. IVSs with butterfly arbitrages (if any)
are not removed as the condition in terms of total implied variance is much more complicated to verify
(see condition (iii) of Theorem 3.1). In total, 6.4% (resp. 5.6%) of the IVSs are removed from the S&P
500 (resp. Euro Stoxx 50) data set.
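The calendar-spread filter can be sketched as follows. The sketch assumes that each slice is given as increasing log-strikes with the corresponding total implied variances; the layout of the actual database is not described here, so the input format and the function names are hypothetical. Since the difference of two piecewise-linear interpolants is itself piecewise linear, it is enough to compare the slices on the union of their log-strike nodes.

```python
import numpy as np


def has_calendar_crossing(k_short, w_short, k_long, w_long):
    """True if the linearly interpolated longer-maturity slice goes below the shorter one."""
    lo, hi = max(k_short[0], k_long[0]), min(k_short[-1], k_long[-1])
    grid = np.union1d(k_short, k_long)
    grid = grid[(grid >= lo) & (grid <= hi)]  # compare on the common log-strike range only
    diff = np.interp(grid, k_long, w_long) - np.interp(grid, k_short, w_short)
    return bool(np.any(diff < 0.0))


def is_free_of_calendar_crossings(slices):
    """`slices` is a list of (log_strikes, total_variances) sorted by increasing maturity."""
    return not any(
        has_calendar_crossing(*slices[i], *slices[i + 1]) for i in range(len(slices) - 1)
    )


# Illustrative surfaces: flat 20% volatility at 6 and 12 months, then a distorted long slice.
k = np.linspace(-0.4, 0.4, 9)
w_6m = np.full_like(k, 0.04 * 0.5)
w_12m = np.full_like(k, 0.04 * 1.0)
w_12m_bad = w_12m.copy()
w_12m_bad[0] = 0.01  # the deep out-of-the-money point falls below the 6-month slice

clean_surface_ok = is_free_of_calendar_crossings([(k, w_6m), (k, w_12m)])
bad_surface_ok = is_free_of_calendar_crossings([(k, w_6m), (k, w_12m_bad)])
```

A surface for which the function returns `False` would be dropped from the data set before the SSVI calibration.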


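We start by comparing the three parametric forms of the function $\varphi$ that we introduced in the
previous section. For this purpose, we calibrate the SSVI for each day of our data sets without calendar
spread arbitrages and for each parametric form of $\varphi$ by solving the following minimization problem using
the minimize function with the SLSQP algorithm from the scipy Python package:

$$
\begin{aligned}
\min_{\Theta = ((\theta_{T_i})_{i=1,\dots,M},\, \rho,\, \Pi_\varphi)} \quad & \sum_{i=1}^{M} \sum_{k \in \mathcal{K}_{T_i}} g(k) \left( \sigma_{mkt}(k, T_i) - \sigma_{SSVI}(k, T_i; \Theta) \right)^2 \\
\text{s.t.} \quad & \theta_{T_1} > 0 \\
& \theta_{T_{i+1}} \ge \theta_{T_i} \text{ for } 1 \le i \le M - 1 \\
& (\rho, \Pi_\varphi) \in C_\varphi.
\end{aligned}
\tag{3.4}
$$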

We used the following notations:

• $\Pi_\varphi$ is the vector of the parameters in $\varphi$: only $\lambda$ for the SSVI-HL and $(\eta, \gamma)$ for the SSVI-PL and
the SSVI-MPL.

• $T_1 < T_2 < \dots < T_M$ is the set of maturities and $\mathcal{K}_T$ is the set of log-strikes for the maturity $T$.

• $g(k)$ is the standard normal density function evaluated at the log-strike $k$. This weighting function
gives more weight to the replication of the implied volatilities that are close to the money.
Another choice based on the Black-Scholes vega has been considered but we did not notice any
improvement.

• $\sigma_{mkt}$ is the market implied volatility.

• $\sigma_{SSVI}(\cdot, \cdot; \Theta)$ is the SSVI implied volatility associated to the parameter vector $\Theta = ((\theta_{T_i})_{i=1,\dots,M}, \rho, \Pi_\varphi)$.

• $C_\varphi$ is given by:

- $C_\varphi = \{(\rho, \lambda) \in (-1, 1) \times \mathbb{R}_+^* \mid \lambda \ge (1 + |\rho|)/4\}$ for the SSVI-HL,

- $C_\varphi = \{(\rho, \eta, \gamma) \in (-1, 1) \times \mathbb{R}_+^* \times (0, 1) \mid \eta^2(1 + |\rho|) < 4 \text{ if } \gamma = 1/2\}$ for the SSVI-PL and,


n(1 + |pl) <4 
Co = § (9,77) € (-1,1) x R4 x (0,1) uy + |p|) < 4 and (1— 2y)p(1 — 27)?(1 + lal) < 4 
n 


if y < 1/2 
if y > 1/2 
if y= 1/2 


for the SSVI-MPL. 


Remark 3.8. The constraints in the above minimization problem do not guarantee the absence of static
arbitrage for the power-law and the modified power-law parameterizations since only the constraints
involving $\rho$, $\eta$ and $\gamma$ are included, together with the non-decreasing property of $T \mapsto \theta_T$. The reason for
this choice is that there is no closed-form expression for $\theta^*$, $\theta_1^*$ and $\theta_2^*$, so adding the
constraints involving these terms would considerably complicate the numerical optimization. By the end of
this section, the study will be restricted to the SSVI-MPL parameterization with $\gamma = 1/2$ (for which there
is no constraint involving $\theta^*$, $\theta_1^*$ or $\theta_2^*$) so this choice has no impact.

Remark 3.9. In practice, we use the change of variable $(\tilde\theta_{T_i})_{i=1,\dots,M}$ where $\tilde\theta_{T_1} = \theta_{T_1}$ and $\tilde\theta_{T_i} = \theta_{T_i} - \theta_{T_{i-1}}$
for $i \in \{2, \dots, M\}$ since it transforms the second inequality constraints in the minimization
problem (3.4) into bound constraints: $\tilde\theta_{T_i} \ge 0$ for $2 \le i \le M$.
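The calibration problem (3.4) with the change of variable of Remark 3.9 can be sketched as follows for the SSVI-MPL with $\gamma = 1/2$. This is a minimal illustration on synthetic quotes, under several assumptions: the same log-strike grid is used for all maturities, volatilities are expressed in percent to keep the objective well scaled, the initial guess and the bounds on $\rho$ and $\eta$ are arbitrary, and the function names are hypothetical. Only the use of `scipy.optimize.minimize` with `method="SLSQP"` comes from the text.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def ssvi_mpl_vol(k, maturity, theta, rho, eta):
    """SSVI implied volatility (in %) with the modified power-law phi and gamma = 1/2."""
    phi = eta / np.sqrt(theta * (1.0 + theta))
    w = 0.5 * theta * (1.0 + rho * phi * k + np.sqrt((phi * k + rho) ** 2 + 1.0 - rho**2))
    return 100.0 * np.sqrt(w / maturity)


def unpack(x, n_maturities):
    """x stores the increments of theta (Remark 3.9), then rho and eta."""
    thetas = np.cumsum(x[:n_maturities])  # theta_{T_i} is the sum of the first i increments
    return thetas, x[n_maturities], x[n_maturities + 1]


def objective(x, k, maturities, market_vols):
    """Weighted squared errors, with the standard normal density of k as weight."""
    thetas, rho, eta = unpack(x, len(maturities))
    weights = norm.pdf(k)
    return sum(
        np.sum(weights * (vols - ssvi_mpl_vol(k, T, theta, rho, eta)) ** 2)
        for T, theta, vols in zip(maturities, thetas, market_vols)
    )


# Synthetic "market" quotes generated by the model itself.
k = np.linspace(-0.5, 0.5, 21)
maturities = np.array([0.25, 0.5, 1.0, 2.0])
true_thetas, true_rho, true_eta = 0.04 * maturities, -0.7, 1.0
market_vols = [ssvi_mpl_vol(k, T, th, true_rho, true_eta) for T, th in zip(maturities, true_thetas)]

n = len(maturities)
x0 = np.array([0.02] + [0.01] * (n - 1) + [-0.5, 0.5])
# Monotonicity of theta in maturity becomes simple bounds on the increments.
bounds = [(1e-8, None)] + [(0.0, None)] * (n - 1) + [(-0.999, 0.999), (1e-6, None)]
# Constraint set for gamma = 1/2: eta^2 (1 + |rho|) <= 4.
constraints = [{"type": "ineq", "fun": lambda x: 4.0 - x[-1] ** 2 * (1.0 + abs(x[-2]))}]
result = minimize(objective, x0, args=(k, maturities, market_vols), method="SLSQP",
                  bounds=bounds, constraints=constraints,
                  options={"ftol": 1e-12, "maxiter": 500})
fitted_thetas, fitted_rho, fitted_eta = unpack(result.x, n)
```

Working with increments means that the fitted $\theta_{T_i}$ are non-decreasing by construction, so only the constraint on $(\rho, \eta)$ has to be passed to the optimizer as a general inequality.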


The initial guess for each parameter is provided in Table 2. The average relative errors (without weighting)
between the market implied volatilities and the SSVI implied volatilities for each day of our data sets
are presented in Figure 12. It appears clearly that the Heston-like parameterization performs very poorly
in comparison to the two other parameterizations. This results from the fact that, in the Heston-like
parameterization, the function $\varphi$ is bounded from above by 1/2 (see Appendix A.1), which strongly
constrains the shape of the IVS. The power-law and the modified power-law parameterizations achieve
essentially the same accuracy, which is very stable over time. In particular, we do not observe a decrease
of the fitting quality during the Covid-19 crisis. In Figure 13, we illustrate how the SSVI-MPL fits the
S&P 500 (resp. the Euro Stoxx 50) total implied variances for a day where the calibrated average relative
error is equal to 1.19% (resp. 1.18%), corresponding to the mean of the average relative errors across the
whole S&P 500 (resp. Euro Stoxx 50) data set.


Table 2: Initial guesses provided to the optimization routine.

Figure 12: Average relative errors between the market implied volatilities and the SSVI implied volatilities after the calibration.


Figure 13: Illustrations of the fit of the SSVI with the modified power-law parameterization on market total variances for all maturities: (a) S&P 500 (October 19, 2016); (b) Euro Stoxx 50 (February 18, 2022). The dots (resp. the lines) correspond to the market (resp. the SSVI) total implied variances.

At this stage, let us recall that our final objective is to design a model to jointly simulate arbitrage-free IVSs and the price of the underlying asset by simulating the evolution of the SSVI parameters as a function of the path of the underlying asset price. However, the number of parameters involved is so large that a model for their joint evolution would be too complicated. Therefore, we need to make the SSVI model more parsimonious. We propose the following two simplifications:


1. We consider only the modified power-law parameterization with $\gamma$ fixed to 1/2 because we observe that $\gamma$ is close to this value over the two data sets (more than 84% of the calibrated $\gamma$'s lie within the $[0.4, 0.6]$ interval). Moreover, this choice has the advantage of guaranteeing that the full surface (without any restriction on the ATM total implied variance) is free of static arbitrage on the sole condition that $\eta^2(1+|\rho|) \le 4$. Finally, as noted in an earlier remark, setting $\gamma = 1/2$ implies a power-law decay of the ATM volatility skew, which is a known stylized fact.


2. We assume that $\theta_T = aT^p$ where $a, p > 0$, which considerably reduces the number of parameters while ensuring that the no-arbitrage constraint on the ATM total variance is always satisfied ($\partial_T\theta_T \ge 0$). This parametric form is inspired by the calibrated vectors $(\theta_{T_i})_{i=1,\dots,M}$, which exhibit an almost linear behavior with the maturity $T$. In Figure 14, we show how this parametric form fits the S&P 500 ATM total variances for several dates (the fit is similar for the Euro Stoxx 50 so it is not shown here). Note that the parameters $a$ and $p$ that have been used in this figure are those calibrated on the whole IVS, so the quality of the fit is reduced in comparison to a calibration on the ATM volatilities only. Despite this, the fit is satisfactory overall, although the concave shapes of the ATM total variances in Figure 14b cannot be well reproduced.

Figure 14: Comparison of the S&P 500 ATM total variances (dots) with the fitted parametric form $T \mapsto aT^p$ (lines) for some dates (each color corresponds to one date). (a) Comparison for the 5 dates where the average relative error is the closest to the mean over all average relative errors. (b) Comparison for the 5 dates where the average relative error is the highest over the whole data set.

The new model obtained after these simplifications is hereafter called the parsimonious SSVI model.


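Definition 3.4 (Parsimonious SSVI). The parsimonious SSVI is the parameterization of the total implied variance surface defined by:

$$w(k,T) = \frac{\theta_T}{2}\left(1 + \rho\,\varphi(\theta_T)\,k + \sqrt{\left(\varphi(\theta_T)\,k + \rho\right)^2 + 1 - \rho^2}\right) \tag{3.5}$$

where $\theta_T = aT^p$ and $\varphi(\theta) = \dfrac{\eta}{\sqrt{\theta(1+\theta)}}$ with $a, p > 0$ and $\eta > 0$.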
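As an illustration of Definition 3.4, the following minimal Python sketch evaluates the parsimonious SSVI total variance and the associated implied volatility. The function names are hypothetical, and the static arbitrage check simply encodes the condition $\eta^2(1+|\rho|) \le 4$ quoted in the text for $\gamma = 1/2$; it is a sketch under these assumptions, not the authors' implementation.

```python
import math

def phi(theta, eta):
    """Modified power-law function with gamma = 1/2."""
    return eta / math.sqrt(theta * (1.0 + theta))

def total_variance(k, T, a, p, rho, eta):
    """Parsimonious SSVI total implied variance w(k, T) with theta_T = a * T**p."""
    theta = a * T ** p
    x = phi(theta, eta) * k
    return 0.5 * theta * (1.0 + rho * x + math.sqrt((x + rho) ** 2 + 1.0 - rho ** 2))

def implied_vol(k, T, a, p, rho, eta):
    """Black-Scholes implied volatility recovered from the total variance."""
    return math.sqrt(total_variance(k, T, a, p, rho, eta) / T)

def satisfies_constraints(a, p, rho, eta):
    """Parameter domain plus the condition eta^2 (1 + |rho|) <= 4 quoted in the text."""
    return a > 0 and p > 0 and eta > 0 and abs(rho) < 1 and eta ** 2 * (1.0 + abs(rho)) <= 4.0
```

At the money ($k = 0$), the square root equals 1 and the formula reduces to $w(0,T) = \theta_T = aT^p$, which is why $a$ and $p$ alone rule the ATM term structure.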
In Figure 15, the average relative errors between the market implied volatilities and the SSVI implied volatilities for each day of our data sets obtained for the parsimonious SSVI model are compared to those obtained for the SSVI model with the modified power-law parameterization. As expected, the calibration accuracy is reduced for the parsimonious SSVI. However, it remains overall quite close to the SSVI-MPL calibration in view of the reduction of the number of parameters: 4 parameters for the parsimonious SSVI versus 27 for the SSVI-MPL. The mean of the average relative errors across the whole S&P 500 (resp. Euro Stoxx 50) data set increases from 1.19% (resp. 1.18%) to 1.65% (resp. 1.60%). In Figure 16, we show the impact of each simplification on the calibration accuracy for the S&P 500 (it is similar for the Euro Stoxx 50). It appears very clearly that the parametric form for $\theta_T$ is the assumption leading to the largest deterioration of the fit to market implied volatilities, which is consistent with the fact that this assumption is the one that most limits the number of degrees of freedom of the model.

Figure 15: Average relative errors between the market implied volatilities and the SSVI implied volatilities for the parsimonious SSVI model and the SSVI-MPL model.


4. Path-dependent SSVI model 


The present section is dedicated to the introduction of a new model for the joint dynamics of an implied volatility surface and the underlying asset price. The calibration results presented in the previous section demonstrate the ability of a particular case of the SSVI parameterization, the parsimonious SSVI, to fit historical implied volatility surfaces reasonably well while guaranteeing the absence of static arbitrage with only 4 parameters. As a consequence, we propose to specify our model as a dynamic version of the parsimonious SSVI: each parameter of the parsimonious SSVI is considered as a stochastic process whose dynamics remains to be determined. One option to jointly model the evolution of the parsimonious SSVI parameters and the underlying asset would be to introduce a correlation between the random noises driving each process. However, in view of the empirical study conducted in Section 2, which indicates that there is a feedback effect of the past returns and the past squared returns of the underlying index price onto the level of the ATM implied volatility, we prefer another option. Instead of using a correlation, the idea is to explicitly model the response of each of the 4 parameters to the evolution of the underlying asset price. To this end, we measure to which extent the trend feature and the volatility feature of the path-dependent volatility (PDV) model presented earlier explain the variations of the 4 parameters of the parsimonious SSVI model. This study is presented in Section 4.1 below. Then, Section 4.2 introduces a variant of the PDV model of Guyon and Lekeufack and the dynamics of the parsimonious SSVI parameters. Finally, Sections 4.3 and 4.4 detail the calibration and the simulation of the model.

Figure 16: Impact of each simplification in the parsimonious SSVI model on the average relative errors between the S&P 500 market implied volatilities and the SSVI implied volatilities.


4.1. Path-dependency of the parsimonious SSVI parameters 


The calibration of the parsimonious SSVI in Section 3.3 provides the daily evolutions of the parameters $a$, $p$, $\rho$ and $\eta$. Based on these daily evolutions, we can calibrate the PDV model where we replace $\text{Volatility}_t$ in the PDV equation by each parameter of the parsimonious SSVI. Note that we consider the logarithm of $p$ instead of $p$ in the PDV model since we observed that this provides a better fit. The calibration methodology is the same as the one described in Section 2.2. Moreover, similarly to the study in Section 2, for each parameter of the parsimonious SSVI, we run a 10-fold blocked cross-validation on the train set to determine the optimal hyperparameters $C_{R_1}$, $C_\Sigma$ and $\lambda$, with the two cut-off lags each taken in $\{5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2000, 2500\}$ and $\lambda$ taken in a grid of negative powers of ten whose largest value is $10^{-1}$. The $R^2$ scores obtained on the train and the test sets for each parameter and for each index are reported in Table 3. On the one hand, these results show that the time evolution of the parameter $a$ is well explained by the evolution of the underlying asset price both on the train and the test sets, which is in line with the results obtained for the ATM implied volatility since $a$ captures the ATM total variance level (whose evolution is similar to the one of the ATM implied volatility, so we expect the study conducted in Section 2 to remain valid for the ATM total variance). The same observation holds for $p$. However, the fact that the PDV model works well for $p$ was not anticipated, since $p$ parameterizes how the ATM total variance increases with the maturity and is not homogeneous to the ATM total variance level. Thus, we emphasize that this is a key finding. On the other hand, the $R^2$ scores for the parameters $\rho$ and $\eta$ are small on the train set and negative on the test set (except for $\rho$ on the Euro Stoxx 50). This indicates that the trend and the volatility features are not related to these two parameters. These observations come as no surprise: $\rho$ and $\eta$ parameterize respectively the orientation and the convexity of the implied volatility smile, as illustrated in Figure 17. It is therefore less clear how the past variations of the underlying asset price could impact these parameters.


Remark 4.1. Guyon and Lekeufack (2023) and (2024) considered a third feature given by $R_1 \mathbb{1}_{\{R_1 > 0\}}$ in the PDV model in order to achieve a satisfying joint SPX/VIX fit. Adding this feature to explain the variations of our 4 parameters does not improve the $R^2$ scores.


Table 3: $R^2$ scores of the PDV model on the parameters of the parsimonious SSVI on the train and the test sets.

            S&P 500                                    Euro Stoxx 50
Parameter   (C_R1, C_Σ, λ)        Train    Test        (C_R1, C_Σ, λ)         Train    Test
a           (25, 1000, 10^-6)     91.8%    51.1%       (10, 1500, 10-)        85.7%    49.4%
p           (100, 2000, 10^-3)    39.4%    48.5%       (250, 50, 10^-1)       76.3%    75.3%
ρ           (100, 5, 10^-1)       6.2%     -68.0%      (10, 25, 10^-3)        35.9%    19.7%
η           (100, 1500, 10^-5)    6.2%     -33.6%      (1000, 1500, 10^-3)    58.0%    -899.5%

Figure 17: Influence of the $\rho$ and $\eta$ parameters on the shape of the implied volatility smile computed from the parsimonious SSVI ($a$ and $p$ are respectively fixed to 0.05 and 1.3).

An idea to explain the variations of $\rho$ and $\eta$ is to consider skewness and kurtosis features. Indeed, Bouchaud and Potters (2003) showed a cumulant expansion formula for the Bachelier implied volatility that explains the presence of the volatility smile and its shape using the skewness and the kurtosis of the underlying asset price distribution. Independently, Backus et al. (2004) showed a similar formula for the Black-Scholes implied volatility. This formula reads:

$$\sigma_{BS}(k,T) \approx \sigma\left(1 - \frac{\mu_3}{3!}\,d - \frac{\mu_4}{4!}\left(1 - d^2\right)\right) \tag{4.1}$$

where $\sigma$, $\mu_3$ and $\mu_4$ are respectively the standard deviation, the skewness and the kurtosis of the log-return under the risk-neutral probability and $d = -k/\sigma + \sigma/2$. Replacing $d$ by its expression in Equation (4.1) yields:

$$\sigma_{BS}(k,T) \approx \sigma + \frac{\mu_3}{6}\,k - \frac{\mu_3}{12}\,\sigma^2 - \frac{\mu_4}{24}\,(1+k)\,\sigma + \frac{\mu_4}{24}\,\frac{k^2}{\sigma} + \frac{\mu_4}{96}\,\sigma^3 \tag{4.2}$$
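The substitution leading from (4.1) to (4.2) can be checked step by step. The short derivation below is a sketch using only the definitions of $d$, $\mu_3$ and $\mu_4$ given in the text:

```latex
\begin{align*}
\sigma d &= -k + \frac{\sigma^2}{2},
\qquad
\sigma d^2 = \frac{k^2}{\sigma} - k\sigma + \frac{\sigma^3}{4}, \\
\sigma_{BS}(k,T)
&\approx \sigma - \frac{\mu_3}{6}\,\sigma d - \frac{\mu_4}{24}\,\sigma + \frac{\mu_4}{24}\,\sigma d^2 \\
&= \sigma + \frac{\mu_3}{6}\,k - \frac{\mu_3}{12}\,\sigma^2
 - \frac{\mu_4}{24}\,\sigma - \frac{\mu_4}{24}\,k\sigma
 + \frac{\mu_4}{24}\,\frac{k^2}{\sigma} + \frac{\mu_4}{96}\,\sigma^3 .
\end{align*}
```

For a fixed log-strike, the right-hand side is a linear combination of six quantities: $\sigma$, $\mu_3$, $\mu_3\sigma^2$, $\mu_4\sigma$, $\mu_4/\sigma$ and $\mu_4\sigma^3$.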


By analogy, we may consider the following regression model:

$$X_t = \beta_0 + \beta_1\Sigma_t + \beta_2 S_t + \beta_3 S_t\Sigma_t^2 + \beta_4 K_t\Sigma_t + \beta_5\frac{K_t}{\Sigma_t} + \beta_6 K_t\Sigma_t^3 \tag{4.3}$$

where $X_t$ is the value of either $\rho$ or $\eta$ at time $t$, $\Sigma_t$ is the volatility feature (1.3) of the PDV model and $S_t$, $K_t$ are respectively skewness and kurtosis features defined as:

$$S_t = \frac{\sum_{t_i \le t} K_3(t - t_i)\, r_{t_i}^3}{\Sigma_t^3}, \qquad K_t = \frac{\sum_{t_i \le t} K_4(t - t_i)\, r_{t_i}^4}{\Sigma_t^4}. \tag{4.4}$$

Note that the three kernels $K_2$, $K_3$ and $K_4$ are assumed to be TSPL kernels. For the sake of simplicity, the cut-off lag is set to 1000 for all kernels and the penalization $\lambda$ is set to zero. The $R^2$ scores resulting from the calibration of this regression model are presented in Table 4. Note that we present both the scores obtained using the standard splitting of the data sets described earlier and the average scores obtained using a 5-fold blocked cross-validation⁴. We remark that the scores are negative on the test set for all instances except for the parameter $\eta$ on the S&P 500 data set when using the standard splitting, but it becomes negative with the 5-fold blocked cross-validation. To verify that these negative scores are not the consequence of an overfitted model, we tested the $2^6 - 1$ non-empty combinations of the 6 features in Equation (4.3) but we mostly obtained negative scores on the test set. For the instances where the score was positive on the test set, we ran a 5-fold blocked cross-validation which systematically gave negative scores on the test set. Therefore, we conclude that these 6 features are not relevant to predict the variations of the parameters $\rho$ and $\eta$.

Table 4: $R^2$ scores of the regression model for $\rho$ and $\eta$.

                                  S&P 500                Euro Stoxx 50
Parameter   5-fold blocked CV     Train    Test          Train    Test
ρ           No                    43.6%    -57554.5%     41.2%    -219.8%
ρ           Yes                   46.6%    -22464.2%     44.1%    -104.3%
η           No                    60.4%    75.7%         74.7%    -59.4%
η           Yes                   64.4%    -3997.2%      83.5%    -169.5%
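To make the role of the blocked cross-validation and of negative $R^2$ scores concrete, here is a minimal Python sketch. It assumes one common reading of "blocked" cross-validation, namely contiguous test blocks without shuffling; the exact procedure used in the paper is the one described in its Section 2 and may differ (for instance by leaving gaps between blocks). The function names are hypothetical.

```python
def blocked_kfold(n_samples, n_folds):
    """Yield (train_indices, test_indices); each test fold is one contiguous block."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def r2_score(y_true, y_pred):
    """Coefficient of determination; it is negative when the prediction is
    worse than the constant prediction equal to the mean of y_true."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot
```

Since the $R^2$ score is unbounded from below, very negative test scores such as those of Table 4 simply indicate predictions far worse than a constant.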


Beyond the interpretation of $\rho$ as a parameter that controls the orientation of the volatility smile, it is also possible to interpret it as the correlation between the two Brownian motions in the Heston model (the so-called "spot-vol" correlation) using the convergence of the Heston model towards the SVI parameterization (Gatheral and Jacquier, 2011). Because of this link, it is reasonable to study whether one can explain the variations of the parameter $\rho$ using the correlation between the underlying price and its volatility. As a measure of the volatility, we use the daily realized volatility estimates from Heber et al. spanning the period from January 1, 2000 to December 31, 2021. In the same spirit as the features of the PDV model, we introduce a correlation feature based on a TSPL kernel:

$$\Gamma_t = \frac{\sum_{t_i \le t} K(t - t_i)\,(r_{t_i} - \bar r_t)\,(\sigma_{t_i} - \bar\sigma_t)}{\sqrt{\sum_{t_i \le t} K(t - t_i)\,(r_{t_i} - \bar r_t)^2}\;\sqrt{\sum_{t_i \le t} K(t - t_i)\,(\sigma_{t_i} - \bar\sigma_t)^2}} \tag{4.5}$$

where $\bar r_t$ and $\bar\sigma_t$ denote the averages of the returns $r_{t_i}$ and of the volatilities $\sigma_{t_i}$ over the window and $\sigma_t$ is the above-mentioned volatility. The calibration of the model $\rho = \beta_0 + \beta_1\Gamma_t$ with a cut-off lag $C$ fixed to 1000 and a penalization $\lambda$ fixed to 0 again gives a very low $R^2$ score both on the train and test sets, leading us to the conclusion that there does not seem to be any link in practice between the parameter $\rho$ in the SSVI and the correlation between the underlying price and its volatility.
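A kernel-weighted correlation feature of the type (4.5) can be sketched in a few lines of Python. The function name and the use of plain window averages for the centering are assumptions made for illustration; the kernel weights are passed as a list so that any kernel (TSPL or otherwise) can be plugged in.

```python
import math

def kernel_correlation(returns, vols, weights):
    """Kernel-weighted correlation between returns and volatilities.

    Both series are ordered from oldest to newest; weights[lag] applies to the
    observation located `lag` steps before the most recent one.
    """
    n = len(weights)
    if len(returns) < n or len(vols) < n:
        raise ValueError("not enough observations for the requested window")
    r = returns[-n:][::-1]  # most recent observation first
    v = vols[-n:][::-1]
    r_bar = sum(r) / n      # plain window averages (an assumption of this sketch)
    v_bar = sum(v) / n
    cov = sum(w * (x - r_bar) * (y - v_bar) for w, x, y in zip(weights, r, v))
    var_r = sum(w * (x - r_bar) ** 2 for w, x in zip(weights, r))
    var_v = sum(w * (y - v_bar) ** 2 for w, y in zip(weights, v))
    return cov / math.sqrt(var_r * var_v)
```

Any positive rescaling of the weights leaves the feature unchanged, so the normalization constant of the kernel plays no role here.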


4.2. Specification of the model for the underlying asset price and the IVS 


Based on the analysis in the previous section, we provide in this section the dynamics of the four parameters in the parsimonious SSVI model. Since two of these parameters depend on the past path of the underlying asset price, we also need a model for the dynamics of the underlying asset price in order to be able to simulate IVSs over time. Because we want these simulations to be realistic, the model for the underlying asset price should also be as realistic as possible. We opt for the PDV model of Guyon and Lekeufack (more precisely, a variant of their model as we will see) as it replicates almost all historical stylized facts of equity prices (leverage effect, volatility clustering, weak and strong Zumbach effects). The asset price $(S_t)_{t \ge 0}$ is assumed to evolve as follows:

$$\begin{aligned}
\frac{dS_t}{S_t} &= \sigma_t\, dW^S_t \\
\sigma_t &= \beta^\sigma_0 + \beta^\sigma_1 R^\sigma_{1,t} + \beta^\sigma_2 \Sigma^\sigma_t + \varepsilon^\sigma_t \\
R^\sigma_{1,t} &= \int_{-\infty}^{t} \frac{Z_{\alpha^\sigma_1,\delta^\sigma_1}}{(t - u + \delta^\sigma_1)^{\alpha^\sigma_1}}\, \frac{dS_u}{S_u} \\
\Sigma^\sigma_t &= \sqrt{\int_{-\infty}^{t} \frac{Z_{\alpha^\sigma_2,\delta^\sigma_2}}{(t - u + \delta^\sigma_2)^{\alpha^\sigma_2}} \left(\frac{dS_u}{S_u}\right)^2}
\end{aligned} \tag{4.6}$$

where $\sigma_t$ is the instantaneous or spot volatility, $(W^S_t)_{t \ge 0}$ is a Brownian motion, $Z_{\alpha,\delta} = \left(\int_{-\infty}^{t} \frac{du}{(t - u + \delta)^{\alpha}}\right)^{-1}$ and $\varepsilon^\sigma_t$ is a residual accounting for the fact that the PDV model does not perfectly explain the variations of the spot volatility $\sigma_t$. This specification is similar to the one proposed by Guyon and Lekeufack except that we do not approximate the TSPL kernels by linear combinations of exponential kernels. Guyon and Lekeufack make this approximation to recover a Markovian model that is very fast to simulate. We choose not to follow suit since we already achieve reasonable simulation times, as we will show in the numerical experiments. Besides, Guyon and Lekeufack propose to consider multiplicative residuals instead of additive residuals, i.e. they specify the dynamics of the spot volatility as $\sigma_t = \kappa_t(\beta^\sigma_0 + \beta^\sigma_1 R^\sigma_{1,t} + \beta^\sigma_2 \Sigma^\sigma_t)$ where $(\kappa_t)_{t \ge 0}$ is an Ornstein-Uhlenbeck process or an exponential Ornstein-Uhlenbeck process. The choice of the latter process has the advantage of guaranteeing the positivity of the spot volatility provided that $\beta^\sigma_0 + \beta^\sigma_1 R^\sigma_{1,t} + \beta^\sigma_2 \Sigma^\sigma_t > 0$, which is always the case in the simulations for the estimated parameters, as already noted by Guyon and Lekeufack. Again, we propose not to follow suit since we observed that the volatility of the simulated underlying returns was too low in comparison to the historical volatility when using multiplicative residuals (see Figure 18a). The use of additive residuals helps to increase it, although it is still slightly lower (see Figure 18b). To further improve the replication of the historical volatility, one could replace the increments of the Brownian motion $W^S$ with random variables having fatter tails. This modification of the model is however not investigated in this paper. The model that we retain for $\varepsilon^\sigma$ is provided in the next section.

⁴ We considered only 5 folds to limit the computational cost.

Figure 18: Comparison of the volatility of the simulated returns (using 1000 simulations) at each time step of the projection to the historical volatility of the Euro Stoxx 50. (a) Multiplicative residuals modelled by an exponential Ornstein-Uhlenbeck process. (b) Additive residuals modelled by a non-central t-distribution. The left graph corresponds to simulations of the PDV model with multiplicative residuals modelled by an exponential Ornstein-Uhlenbeck process while the right graph corresponds to simulations of the model (4.6) where $\varepsilon^\sigma$ is modelled by a non-central t-distribution. Note that different choices of the models (e.g. a Cox-Ingersoll-Ross process for the multiplicative residuals and an Ornstein-Uhlenbeck process for the additive residuals) lead to similar results. The observed decrease of the simulated volatility on the first time steps results from the fact that the simulations are initialized with the historical path of the underlying asset price.
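The trend and volatility features entering the spot volatility in (4.6) can be approximated on a daily grid by truncated sums over past returns. The Python sketch below is illustrative only: the function names are hypothetical, the closed-form normalization $Z_{\alpha,\delta} = (\alpha - 1)\,\delta^{\alpha-1}$ assumes $\alpha > 1$, and whether a time-step factor multiplies the sums depends on the convention adopted for the returns, which is not reproduced here.

```python
import math

def tspl_kernel(tau, alpha, delta):
    """Time-shifted power-law kernel Z / (tau + delta)**alpha, with Z chosen so
    that the kernel integrates to 1 over tau >= 0 (requires alpha > 1)."""
    z = (alpha - 1.0) * delta ** (alpha - 1.0)
    return z * (tau + delta) ** (-alpha)

def pdv_features(returns, kernel_1, kernel_2, cutoff, dt=1.0 / 252):
    """Truncated-sum approximations of the trend and volatility features.

    `returns` is ordered from oldest to newest, kernel_i = (alpha_i, delta_i)
    and `cutoff` is the number of past returns kept in the sums.
    """
    recent_first = returns[-cutoff:][::-1]
    trend = sum(tspl_kernel(lag * dt, *kernel_1) * r
                for lag, r in enumerate(recent_first))
    second_moment = sum(tspl_kernel(lag * dt, *kernel_2) * r * r
                        for lag, r in enumerate(recent_first))
    return trend, math.sqrt(second_moment)

def spot_volatility(trend, vol_feature, betas, residual=0.0):
    """Additive-residual specification: beta_0 + beta_1 R_1 + beta_2 Sigma + eps."""
    beta_0, beta_1, beta_2 = betas
    return beta_0 + beta_1 * trend + beta_2 * vol_feature + residual
```

In a simulation loop, each newly simulated return would be appended to `returns` before recomputing the features, which is what makes the model path-dependent and non-Markovian when the kernels are not approximated by exponentials.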


Second, since both parameters $a$ and $p$ in the parsimonious SSVI model exhibit a path-dependent behavior with respect to the underlying asset price, we propose the following dynamics for these parameters:

$$\begin{aligned}
a_t &= \kappa^a_t \left(\beta^a_0 + \beta^a_1 R^a_{1,t} + \beta^a_2 \Sigma^a_t\right) \\
p_t &= \kappa^p_t \exp\left(\beta^p_0 + \beta^p_1 R^p_{1,t} + \beta^p_2 \Sigma^p_t\right) \\
R^i_{1,t} &= \int_{-\infty}^{t} \frac{Z_{\alpha^i_1,\delta^i_1}}{(t - u + \delta^i_1)^{\alpha^i_1}}\, \frac{dS_u}{S_u} \quad \text{for } i \in \{a, p\} \\
\Sigma^i_t &= \sqrt{\int_{-\infty}^{t} \frac{Z_{\alpha^i_2,\delta^i_2}}{(t - u + \delta^i_2)^{\alpha^i_2}} \left(\frac{dS_u}{S_u}\right)^2} \quad \text{for } i \in \{a, p\}
\end{aligned} \tag{4.7}$$

where $\kappa^a$ and $\kappa^p$ are time-dependent multiplicative factors capturing the variations in $a$ and $p$ that are not due to the past movements in the underlying asset price. Note that we consider different TSPL kernel parameters for $\sigma$, $a$ and $p$. Choosing to have common features $R_1$ and $\Sigma$ for $\sigma$, $a$ and $p$ with parameter-specific $\beta$'s is also an option, but it requires calibrating the PDV model simultaneously on the three time series and it would probably reduce the $R^2$ scores in comparison to the ones obtained with a calibration of the PDV model for each of the three variables.


The four quantities whose dynamics have yet to be specified are the two multiplicative factors $\kappa^a$ and $\kappa^p$ as well as the two parameters $\rho$ and $\eta$ of the parsimonious SSVI model. The historical evolution of these four quantities (see Figure 20 in the following section) reveals that there are some periods where the four parameters become simultaneously more volatile and take more extreme values. The most striking example of this is the period from May 2016 to July 2017 for the S&P 500. Although this period is difficult to associate with any major event on the financial markets, it is not an artifact of the parsimonious SSVI calibration (although the fitting error is larger on this period as one can see in Figure 15a). Indeed, the raw implied volatility data exhibit significant changes in the shape of the IVS during this period, as illustrated in Figure 19. In order to capture this phenomenon within the modelling, we propose to consider a hidden semi-Markov model with two states. While the time spent in a given state is exponentially distributed in a true Markov model, a semi-Markov model allows one to choose the distribution of the sojourn time in each state, thus making it possible to produce long sojourn times. Let us denote by $(J_t)_{t \ge 0}$ a semi-Markov process with two states (1 and 2) and let us set $X_t = (\kappa^a_t, \kappa^p_t, \rho_t, \eta_t)$. The random variable $J_t$ can be interpreted as the unobserved economic regime or state in which the process $X$ is at date $t$. The dynamics of $(X_t)_{t \ge 0}$ is specified as follows:

$$dX_t = \mathrm{diag}(N_{J_t})\,(M_{J_t} - X_t)\,dt + \mathrm{diag}\left(\sqrt{F(X_t)}\right)\Gamma_{J_t}\,dW^X_t \tag{4.8}$$

where for $i \in \{1, 2\}$,

• $N_i$ is a vector of size 4 representing the mean-reversion speed of $X$ in the regime $i$,

• $M_i$ is a column vector of size 4 representing the mean-reversion level of $X$ in the regime $i$,

• $\Gamma_i$ is a lower triangular matrix of size $(4, 4)$ such that $\Gamma_i\Gamma_i^\top$ is the covariance matrix of the Brownian terms driving $X$ in the regime $i$,

and

• $\mathrm{diag}(Y)$ is the diagonal matrix with diagonal entries $Y^1, \dots, Y^4$ for $Y = (Y^1, Y^2, Y^3, Y^4) \in \mathbb{R}^4$,

• $\sqrt{F(X_t)} = \left(\sqrt{\kappa^a_t}, \sqrt{\kappa^p_t}, \sqrt{(1 - \rho_t)(1 + \rho_t)}, \sqrt{\eta_t}\right)$,

• $(W^X_t)_{t \ge 0}$ is a 4-dimensional Brownian motion.

Note that conditionally on $\{J_t = i, \forall t \ge 0\}$, $(\kappa^a_t)_{t \ge 0}$, $(\kappa^p_t)_{t \ge 0}$ and $(\eta_t)_{t \ge 0}$ are Cox-Ingersoll-Ross (CIR) processes while $(\rho_t)_{t \ge 0}$ is a Jacobi process lying between -1 and 1. The choice of CIR processes for $\kappa^a$ and $\kappa^p$ is motivated by the fact that these two quantities should be positive to guarantee the positivity of $a$ and $p$ respectively, which in turn ensures the no-arbitrage constraint $\partial_T\theta_T \ge 0$. Similarly, $\eta$ should be positive for the modified power-law parameterization $\varphi(\theta) = \frac{\eta}{\sqrt{\theta(1+\theta)}}$ to also be positive. Finally, $\rho$ should be in $(-1, 1)$ by definition of the SSVI parameterization, hence the use of a Jacobi process.

Figure 19: Illustration of the change in the shape of the S&P 500 market total implied variance surface motivating the introduction of a regime-switching process.


4.3. Model calibration 


This section details a calibration methodology for all the parameters involved in the path-dependent SSVI model whose dynamics has been specified in the previous section. Starting with the spot volatility $\sigma$, the features $R^\sigma_1$ and $\Sigma^\sigma$ in Equation (4.6) are discretized and truncated as follows:

$$R^\sigma_{1,t} \approx \sum_{t - C^\sigma_{R_1} \le t_i \le t} \frac{Z_{\alpha^\sigma_1,\delta^\sigma_1}}{(t - t_i + \delta^\sigma_1)^{\alpha^\sigma_1}}\, r_{t_i}, \qquad \Sigma^\sigma_t \approx \sqrt{\sum_{t - C^\sigma_\Sigma \le t_i \le t} \frac{Z_{\alpha^\sigma_2,\delta^\sigma_2}}{(t - t_i + \delta^\sigma_2)^{\alpha^\sigma_2}}\, r_{t_i}^2} \tag{4.9}$$

where $r_{t_i} = \frac{S_{t_i} - S_{t_{i-1}}}{S_{t_{i-1}}}$. The parameters $(\alpha^\sigma_1, \delta^\sigma_1, \alpha^\sigma_2, \delta^\sigma_2, \beta^\sigma_0, \beta^\sigma_1, \beta^\sigma_2)$ are then estimated using the approach described in Section 2.2 (except that we use the full data set instead of splitting it into a train set and a test set). As a proxy of the spot volatility, we use the daily realized volatility estimates of Heber et al. since we consider that they represent the best proxy of instantaneous volatility that we have access to. Note that we use only the past returns until time $t$ to predict the realized volatility at time $t + \Delta$ where $\Delta = 1$ day since the return at time $t$ depends on the volatility at time $t$ (this approach has also been implemented in the literature). The cut-off lags $C^\sigma_{R_1}$ and $C^\sigma_\Sigma$ and the penalization $\lambda^\sigma$ are estimated upstream using a 5-fold blocked cross-validation following the methodology described in Section 2.3.4. From this first calibration, we deduce the historical time series of $\varepsilon^\sigma$ as the differences between the "true" realized volatilities and the predicted realized volatilities. Since this historical time series presents a very small autocorrelation and its empirical distribution exhibits a fat right tail, we model the residuals $\varepsilon^\sigma_t$ as i.i.d. non-central t-distributed random variables with noncentrality parameter $c$, number of degrees of freedom $k$, location $\mu$ and scale $\gamma$:

$$\varepsilon^\sigma_t = \mu + \gamma\,\frac{Y + c}{\sqrt{V/k}} \tag{4.10}$$

where $Y$ is a standard normal random variable and $V$ is an independent chi-square random variable with $k$ degrees of freedom. The four parameters $(\mu, \gamma, c, k)$ of the non-central t-distribution are estimated using a numerical maximization of the log-likelihood.
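The representation (4.10) gives a direct way to sample the residuals. The Python sketch below uses only the standard library; the function name is hypothetical, and the chi-square variable is drawn as a Gamma variable with shape $k/2$ and scale 2, which is a standard identity.

```python
import math
import random

def sample_residual(mu, gamma, c, k, rng):
    """Draw one location-scale non-central t variable:
    mu + gamma * (Y + c) / sqrt(V / k), with Y ~ N(0, 1) and V ~ chi-square(k)."""
    y = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(k / 2.0, 2.0)  # chi-square with k degrees of freedom
    return mu + gamma * (y + c) / math.sqrt(v / k)

# Example with made-up parameter values (not the calibrated ones).
rng = random.Random(2024)
draws = [sample_residual(0.0, 0.01, 0.5, 5.0, rng) for _ in range(1000)]
```

A positive noncentrality parameter $c$ skews the distribution to the right, which is how the fat right tail of the residuals can be captured.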


The historical time series of the parameters $a$, $p$, $\rho$ and $\eta$ of the parsimonious SSVI model are obtained in Section 3.3. We again rely on the approach described in Section 2.2 to calibrate the $\alpha$'s, $\delta$'s and $\beta$'s in Equation (4.7). This actually corresponds to the calibration performed in Section 4.1, with the difference that we do not split the data set into train and test sets and we only consider data until December 31, 2021, as the realized volatility data set ends at this date and we want to be consistent across the calibrations on different data sets. We deduce the historical time series of $\kappa^a$ and $\kappa^p$ from this calibration. Let us denote by $(t_k := k\Delta)_{k=0,\dots,n}$ the time grid on which the process $X = (\kappa^a, \kappa^p, \rho, \eta)$ is observed. Note that $\Delta = 1/252$ (1 business day) in our case. The corresponding observations of $X$ are denoted by $x_{t_0}, \dots, x_{t_n}$. Endowed with the historical evolution of the vector $X$, we calibrate the hidden semi-Markov model using the EM algorithm described by Guédon (2003). To this end, we consider a semi-Markov chain where the sojourn time $d_i$ in the state $i \in \{1, 2\}$ follows a Zipf distribution:

$$d_i(u) = \mathbb{P}\left(J_{t_{k+u+1}} \ne i,\; J_{t_{k+v}} = i \;\forall v \in \{2, \dots, u\} \;\middle|\; J_{t_{k+1}} = i,\; J_{t_k} \ne i\right) = \frac{1}{H_{U,s_i}}\,\frac{1}{u^{s_i}}, \quad u = 1, \dots, U \tag{4.11}$$

where $H_{U,s} = \sum_{k=1}^{U} k^{-s}$ is the $U$-th generalized harmonic number of order $s$. Note that we consider a discrete semi-Markov chain for simplicity but its continuous equivalent (a truncated Pareto distribution) could be chosen instead if one wanted to be able to simulate the model for any discretization time step. The parameter $U$ is fixed to 5000, which seems to be a safe cut-off, and the parameters $(s_i)_{i=1,2}$ are included in the set of parameters estimated by the EM algorithm. We also include the initial distribution $(\xi_i := \mathbb{P}(J_{t_0} = i \mid X_{t_0} = x_{t_0}))_{i=1,2}$ in the set of estimated parameters. In total, the following parameters are estimated by the EM algorithm: $(N_i^j)_{i=1,2;\, j=1,\dots,4}$, $(M_i^j)_{i=1,2;\, j=1,\dots,4}$, $(\Gamma_i^{jk})_{i=1,2;\, j,k=1,\dots,4}$, $(s_i)_{i=1,2}$ and $(\xi_i)_{i=1,2}$.

In order to find a good starting point for the EM algorithm, we start by calibrating independently each component of $X$ with the EM algorithm, so that the estimation of the model (4.8) can be decomposed in two steps:

1. estimation of the parameters of each of the four components of $X$ and

2. estimation of the complete model starting from the parameters estimated in the first step or their means over the four components.
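The truncated Zipf sojourn-time distribution (4.11) is easy to tabulate and sample. The following Python sketch relies only on the standard library; the function names are hypothetical and the value of $s$ is made up for illustration.

```python
import random
from itertools import accumulate

def zipf_pmf(s, U):
    """Probabilities d(u) = u**(-s) / H_{U,s} for u = 1, ..., U."""
    weights = [u ** (-s) for u in range(1, U + 1)]
    harmonic = sum(weights)  # generalized harmonic number H_{U,s}
    return [w / harmonic for w in weights]

def sample_sojourn_times(s, U, size, rng):
    """Draw `size` sojourn times (in number of time steps) from the truncated Zipf law."""
    cumulative = list(accumulate(zipf_pmf(s, U)))
    return rng.choices(range(1, U + 1), cum_weights=cumulative, k=size)

# Example: with s = 1.5 and U = 5000, long sojourn times keep a non-negligible probability.
pmf = zipf_pmf(1.5, 5000)
prob_longer_than_100_steps = sum(pmf[100:])
```

The polynomial decay of the probabilities is what allows long stays in a regime, in contrast with the geometric sojourn times of a discrete-time Markov chain.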


The initialization of the EM algorithm for the component $j \in \{1, 2, 3, 4\}$ of $X$ in the first step is described below:

• Initialization of $(N_i^j)_{i=1,2}$, $(M_i^j)_{i=1,2}$ and the CIR or Jacobi volatilities $(\zeta_i^j)_{i=1,2}$: The K-means clustering algorithm with $K = 2$ is applied to the historical time series of $X^j$ to infer the state of each data point. Then, the MLE estimator of Wei et al. is used to estimate the parameters $(N_i^j)_{i=1,2}$, $(M_i^j)_{i=1,2}$ and $(\zeta_i^j)_{i=1,2}$.

• Initialization of $(s_i)_{i=1,2}$: We set $s_1 = 1.5$ and $s_2 = 2$.

• Initialization of $(\xi_i)_{i=1,2}$: We set $\xi_i = 1$ if the state identified by the K-means clustering algorithm at $t_1$ is $i$ and 0 otherwise.

In the presentation of his EM algorithm for estimating hidden semi-Markov chains, Guédon (2003) considers a non-parametric observable process $X$ whose samples $X_{t_0}, \dots, X_{t_n}$ are independent conditionally on the hidden semi-Markov chain $J$. In our case, the distribution of $X$ is parametric and the samples are not independent conditionally on $J$. This implies that the maximization step has to be adapted to our specific setting. For this purpose, we rely on the approach described by Janczura and Weron, who discretize the dynamics of a constant elasticity of variance (CEV) model using the Euler-Maruyama scheme to get explicit formulas for the parameter estimators.
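To illustrate how an Euler-Maruyama discretization leads to explicit estimators, the Python sketch below treats a single CIR component $dX = N(M - X)\,dt + \zeta\sqrt{X}\,dW$. It is written in the spirit of the approach mentioned above and is not the formulas of the cited reference: the function name is hypothetical, and the weights stand for the probabilities of being in a given regime at each transition (as produced by an expectation step).

```python
import math

def fit_cir_euler(x, weights, dt):
    """Weighted Euler-Maruyama estimators (speed N, level M, volatility zeta).

    After dividing the Euler step by sqrt(x_k), the model becomes a linear
    regression y_k = A * u_k + B * v_k + zeta * sqrt(dt) * noise, with
    u_k = 1 / sqrt(x_k), v_k = sqrt(x_k), A = N * M * dt and B = -N * dt.
    weights[k] is attached to the transition x[k] -> x[k + 1].
    """
    rows = []
    suu = suv = svv = suy = svy = sw = 0.0
    for k in range(len(x) - 1):
        w = weights[k]
        root = math.sqrt(x[k])
        u, v, y = 1.0 / root, root, (x[k + 1] - x[k]) / root
        suu += w * u * u
        suv += w * u * v
        svv += w * v * v
        suy += w * u * y
        svy += w * v * y
        sw += w
        rows.append((w, u, v, y))
    det = suu * svv - suv * suv
    a_hat = (suy * svv - svy * suv) / det  # weighted least-squares solution
    b_hat = (svy * suu - suy * suv) / det
    resid2 = sum(w * (y - a_hat * u - b_hat * v) ** 2 for w, u, v, y in rows)
    zeta = math.sqrt(resid2 / (sw * dt))
    return -b_hat / dt, -a_hat / b_hat, zeta
```

Because the conditional law of one Euler step is Gaussian, maximizing the weighted log-likelihood reduces to this weighted least-squares problem, which is why closed-form updates are available in the univariate case.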


Once the parameters of each component of $X$ have been estimated, we initialize the EM algorithm for the complete model (4.8) as follows:

• Initialization of $(N_i^j)_{i=1,2;\, j=1,\dots,4}$ and $(M_i^j)_{i=1,2;\, j=1,\dots,4}$: We use the parameters calibrated in the first step.

• Initialization of $(\Gamma_i^{jk})_{i=1,2;\, j,k=1,\dots,4}$: We average the 4 smoothed probabilities $\mathbb{P}(J_{t_k} = i \mid X_{t_0} = x_{t_0}, \dots, X_{t_n} = x_{t_n})$ for $k \in \{1, \dots, n\}$ obtained in the first step, which gives us an approximation of the most likely state at all dates. Denoting by $t_{i_0}, \dots, t_{i_m}$ the dates where the most likely state is $i$, we compute the residuals of each component $j$ for each regime $i$ as


BF a (4.12) 


Note that we only consider the pairs of dates $(t_{i_l}, t_{i_{l+1}})$ that are consecutive, i.e. such that $t_{i_{l+1}} - t_{i_l} = \Delta$, in order to put aside the pairs of dates between which the most likely state is no longer $i$. Finally, we set the initial value of the matrix $(\Gamma_i^{jk})_{j,k=1,\dots,4}$ as the lower triangular matrix in the Cholesky decomposition of the matrix $(\hat c_i^{jk}\,\zeta_i^j\,\zeta_i^k)_{j,k=1,\dots,4}$ where $\hat c_i^{jk}$ is the empirical correlation between the residuals of the components $j$ and $k$ in the regime $i$.

• Initialization of $(s_i)_{i=1,2}$ and $(\xi_i)_{i=1,2}$: We average the values obtained in the first step for each component.

In the multivariate case, it is no longer possible to get explicit formulas for the estimators of $(N_i^j)_{i=1,2;\, j=1,\dots,4}$, $(M_i^j)_{i=1,2;\, j=1,\dots,4}$ and $(\Gamma_i^{jk})_{i=1,2;\, j,k=1,\dots,4}$ in the maximization step of the EM algorithm. However, since $X_{t_k}$ given $X_{t_{k-1}} = x_{t_{k-1}}$ and $J_{t_k} = i$ has a multivariate normal distribution when discretizing the SDE (4.8) with the Euler-Maruyama scheme, we can compute the conditional density explicitly. Therefore, we rely on a numerical optimization procedure for the maximization step of the parameters of $X$.


4.4. Numerical results 


The aim of this section is to provide some evidence of the consistency of the proposed model with his- 
torical IVSs data. 


In Figure we start by showing the historical evolution of the parameters k*, kP”, 7 and p that 
compose the hidden semi-Markov process X as well as the most likely state (obtained using the smoothed 
probabilities P(I;, = i | Xto = Tto; --, Xt, = £t, )) for each date after the calibration. These graphs 
show that the periods of high volatility are correctly identified by the model. 
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Figure 20: Historical evolution of the four components of X for the S&P 500 and most likely regime at 
each date after the calibration of the hidden semi-Markov model. 


Using the calibrated parameters, we first simulate trajectories of $X$ with a daily time step
conditionally on the path of the S&P 500 index between August 26, 2004 and December 31, 2021 and the
path of the Euro Stoxx 50 index between September 27, 2006 and December 31, 2021 (the period from
March 8, 2012 to December 31, 2021 is the one used for the calibration of the parsimonious SSVI
parameters, and the period before is the one required for computing the path-dependent features).
The sojourn times are simulated using the rvs method of scipy.stats.zipfian from the Python package
SciPy, and they are assumed to be independent of all other random sources. The CIR processes
$\kappa^a$, $\kappa^p$ and $\eta$ are simulated using the explicit scheme E(0) with a discretization
time step given by $\Delta/100$ with $\Delta = 1/252$, to ensure that the discretization error
remains limited given that the estimated volatility and mean-reversion speed are large. Lastly, the
Jacobi process is simulated using the full truncation Euler scheme with the same discretization time
step. In Figure 21, we compare the historical evolution in time of the ATM implied volatility curve
as a function of the maturity (in the sequel, we refer to this curve as the IVS ATM term structure)
to the one of a sample path of the path-dependent SSVI model. We observe that the historical and the
simulated paths are visually very close in terms of the level, the amplitude of the variations, the
regularity and the overall shape. Moreover, the spikes of the implied volatility due to a drop in
the underlying asset price are well reproduced.
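The sampling of the sojourn times can be sketched as follows. This is a minimal illustration, assuming a two-state chain that alternates between states at the end of each sojourn; the function name `simulate_regimes` and the Zipfian parameters are hypothetical and are not the calibrated values of the paper.

```python
import numpy as np
from scipy.stats import zipfian

def simulate_regimes(n_days, zipf_params, initial_state=0, seed=None):
    """Simulate a daily path of a two-state semi-Markov chain.

    zipf_params maps each state to the (a, n) parameters of scipy.stats.zipfian,
    whose support is {1, ..., n}; the values used below are hypothetical.
    """
    rng = np.random.default_rng(seed)
    states = np.empty(n_days, dtype=int)
    state, day = initial_state, 0
    while day < n_days:
        a, n_max = zipf_params[state]
        # Sojourn time (in days) drawn independently of all other random sources.
        sojourn = int(zipfian.rvs(a, n_max, size=1, random_state=rng)[0])
        states[day:day + sojourn] = state  # the slice is truncated at n_days
        day += sojourn
        state = 1 - state  # with two states, the chain switches to the other one
    return states

# Toy usage with hypothetical parameters for the two regimes.
params = {0: (1.2, 250), 1: (0.8, 60)}
states = simulate_regimes(500, params, initial_state=0, seed=42)
```

Passing a NumPy `Generator` through `random_state` keeps the regime path reproducible and separate from the random numbers used for the diffusion part.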


Then, we simulate the complete path-dependent SSVI model, i.e. the underlying asset price is also
simulated according to Equations (4.6). This adds two random sources, namely $W^S$ and
$\varepsilon^\sigma$, which we correlate with the Brownian motion driving $X$ as follows:


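1. We compute the residuals associated with the dynamics of the underlying asset price as:

   $$\varepsilon^S_{t_l} = \frac{\log\frac{S_{t_l}}{S_{t_{l-1}}} + \frac{1}{2}\sigma_{t_{l-1}}^2 \Delta}{\sigma_{t_{l-1}}\sqrt{\Delta}} \qquad (4.13)$$

   where $S_{t_l}$ and $\sigma_{t_l}$ denote here the historical values of the underlying asset
   price and the realized volatility respectively.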


2. We estimate the empirical correlations between $\varepsilon^S$ (resp.
   $\tilde{\varepsilon}^\sigma = \Phi^{-1}(F_{(\mu,\nu,c,k)}(\varepsilon^\sigma))$, with $\Phi^{-1}$
   the inverse of the standard normal cumulative distribution function and $F_{(\mu,\nu,c,k)}$ the
   cumulative distribution function of a non-central t-distribution with parameters
   $(\mu,\nu,c,k)$) and the components of the vector
   $(\varepsilon^{\kappa^a}, \varepsilon^{\kappa^p}, \varepsilon^{\rho}, \varepsilon^{\eta})$ of
   residuals of $X$. Note that we assume that the correlations between $\varepsilon^S$ (resp.
   $\tilde{\varepsilon}^\sigma$) and the residuals of $X$ do not depend on the state of the hidden
   semi-Markov chain.


3. By combining these correlation estimates with the correlations between the components of $X$
   estimated by the EM algorithm in the two states of the semi-Markov chain, we obtain the
   correlation matrix of the vector
   $(\varepsilon^S, \tilde{\varepsilon}^\sigma, \varepsilon^{\kappa^a}, \varepsilon^{\kappa^p}, \varepsilon^{\rho}, \varepsilon^{\eta})$
   in each state. Although the empirical correlation between $\varepsilon^S$ and
   $\tilde{\varepsilon}^\sigma$ lies at -25% for the S&P 500 and -19% for the Euro Stoxx 50, we set
   it to zero to avoid introducing a decreasing trend in the underlying price paths. The resulting
   estimate of the correlation matrix in each state is positive definite both for the S&P 500 and
   the Euro Stoxx 50, which allows us to use the Cholesky decomposition to correlate the random
   variables.
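The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the correlation matrix is a hypothetical three-dimensional one (a single residual of $X$ instead of four), and the parameters passed to `scipy.stats.nct` (`df`, `nc`, `loc`, `scale`) are made-up values whose mapping to the parameterization of the paper is not specified here.

```python
import numpy as np
from scipy.stats import nct, norm

rng = np.random.default_rng(0)

# Hypothetical correlation matrix of (eps_S, eps_sigma_tilde, eps_X) in one state,
# with a single X residual for brevity; the (eps_S, eps_sigma_tilde) entry is zero.
corr = np.array([[1.0, 0.0, -0.6],
                 [0.0, 1.0, 0.3],
                 [-0.6, 0.3, 1.0]])

# np.linalg.cholesky raises LinAlgError if corr is not positive definite.
chol = np.linalg.cholesky(corr)
z = rng.standard_normal((20_000, 3)) @ chol.T   # rows are correlated standard normals

# Map the Gaussian variable back to a non-central t residual: eps_sigma = F^{-1}(Phi(z)).
df, nc, loc, scale = 5.0, -0.5, 0.0, 1.0        # hypothetical nct parameters
eps_sigma = nct.ppf(norm.cdf(z[:, 1]), df, nc, loc=loc, scale=scale)

sample_corr = np.corrcoef(z, rowvar=False)      # should be close to corr
```

Because the map from the Gaussian variable to the non-central t residual is increasing, the dependence structure imposed on the Gaussian scale carries over to the transformed residual in the sense of rank correlations.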


Since the model depends on the past evolution of the underlying asset price, we initialize our
simulations using the evolution of the S&P 500 between August 26, 2004 and March 8, 2012 and the
evolution of the Euro Stoxx 50 between September 27, 2006 and March 8, 2012. In Figure 22, we show
the evolution of the ATM term structure of two IVS sample paths obtained through the procedure
described above. Again, we obtain a very convincing evolution, which shows that the dynamics of the
underlying asset price are also realistic. Note that there is nothing in the model or in the
simulation that guarantees that the no-arbitrage condition $\eta^2(1+|\rho|) \le 4$ is satisfied.
Nevertheless, over 1000 simulations over 11 years with a daily time step, only 0.27% (resp. 0.006%)
of the pairs $(\rho_t, \eta_t)$ do not satisfy this condition for the S&P 500 (resp. the Euro Stoxx
50). Besides, let us recall that it is only a sufficient condition for the absence of static
arbitrage in the SSVI parameterization. Therefore, the IVSs that do not satisfy it do not
necessarily admit an arbitrage. In order to guarantee the absence of arbitrage, one can set $\eta$
to $\sqrt{4/(1+|\rho|)}$ when the no-arbitrage condition is not satisfied in the simulation. In
Figure 23, we provide the quantile envelopes of the ATM implied volatility for the maturities
1 month, 12 months and 24 months using this adjustment. These graphs demonstrate that the range of
simulated values is reasonable in view of the historical path. The decreasing trend at the
beginning of each graph results from the initialization of the simulations with the historical
underlying price path. In terms of computational cost, running 1000 simulations over a horizon of
11 years with a daily time step (and a finer time step for $X$ as discussed earlier) takes
approximately 15 minutes for 24 maturities and 11 strikes on a computer equipped with an Intel Core
i7-11850H, 16 cores, 2.6GHz. Two-thirds of the time is needed to simulate the random variables and
the last third to diffuse the processes. The model is implemented in the Python programming
language.

Figure 21: Comparison of the historical IVS ATM term structure to the one of a sample path of the
path-dependent SSVI model conditionally on the underlying historical path.

Figure 22: ATM term structure for two IVS sample paths in the complete path-dependent SSVI model.
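The adjustment of $\eta$ described above amounts to a simple cap. The helper below is a minimal sketch of it; the function name `enforce_no_arbitrage` and the sample values are hypothetical.

```python
import numpy as np

def enforce_no_arbitrage(eta, rho):
    """Cap eta at sqrt(4 / (1 + |rho|)) so that eta^2 * (1 + |rho|) <= 4 holds."""
    eta = np.asarray(eta, dtype=float)
    rho = np.asarray(rho, dtype=float)
    eta_max = np.sqrt(4.0 / (1.0 + np.abs(rho)))
    # Values that already satisfy the condition are left unchanged.
    return np.minimum(eta, eta_max)

# Toy usage: the first pair satisfies the condition, the second one does not.
eta = np.array([1.0, 2.5])
rho = np.array([-0.7, -0.9])
eta_capped = enforce_no_arbitrage(eta, rho)
```

Only the pairs that violate the condition are modified, which is consistent with the very small fraction of violations reported above.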


5. Concluding remarks 


Using historical time series of implied volatility surfaces for the S&P 500 and the Euro Stoxx 50,
we have shown empirically that a large part of the variability of the at-the-money-forward (ATM)
implied volatility for maturities ranging from 1 month to 24 months can be explained by two
features, namely the weighted average of the underlying asset past returns and the weighted average
of the past squared returns. As the maturity increases, the part of the variability explained by
these two features decreases but remains important. Surprisingly, up to four years of the past
evolution of the underlying asset have an impact on the prediction of the implied volatility. Thus,
our empirical study extends the one of Guyon and Lekeufack (2023), who focused on implied
volatility indices and realized volatility. In Section 3.2, we then introduced a parsimonious
version of the SSVI parameterization that depends on four parameters only ($a$, $p$, $\rho$ and
$\eta$) and that still achieves a reasonable fit to the implied volatility surfaces in our two data
sets. This parsimonious version is essentially obtained by considering a parametric form of the ATM
total variance, which avoids having one parameter per maturity. Moreover, it ensures the well-known
power-law decay of the ATM volatility skew. In the last section, we demonstrated that the
variations of the two parameters $a$ and $p$ ruling the ATM implied volatility in the parsimonious
SSVI parameterization can also be widely explained by the two features mentioned earlier. Based on
this observation, we introduced a new model for the joint dynamics of the underlying asset price
and the full implied volatility surface (there is no restriction on the range of maturities and
strikes that one wants to project) embedding the path-dependency of the implied volatility with
respect to the underlying price. On the one hand, the underlying asset price is modelled using the
path-dependent volatility model of Guyon and Lekeufack with additive residuals (i.e. the part of
the variability that is not explained by the two features) modelled by i.i.d. random variables
distributed according to a non-central t-distribution. On the other hand, the residuals of the
parameters $a$ and $p$ and the parameters $\rho$ and $\eta$ are modelled through a semi-Markov
diffusion, which makes it possible to reproduce the periods of high volatility of these parameters
that we observe historically. Extensive details on how to calibrate and simulate this new model are
provided. Finally, we show the high consistency of the sample paths of this model with historical
data and that there is a very small number of arbitrages, which can be easily removed so that all
simulated IVSs are arbitrage-free. The study of the impact of this new model for applications in
asset management, risk management and hedging is left for future research.

Figure 23: Quantile envelopes of the ATM implied volatility for several maturities.
Appendix A. Proofs of the propositions


The three proofs rely on the following lemma. 


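Lemma A.1. For all $\rho \in [-1,1]$, we have:

$$f(\rho) = \frac{1}{\rho^2}\left(1+\sqrt{1-\rho^2}\right) \ge 1. \qquad (A.1)$$

Proof. Since $\sqrt{1-\rho^2} \ge 0$, $f(\rho) \ge \frac{1}{\rho^2}$. The result follows from the fact
that we assume $|\rho| \le 1$.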


Appendix A.1. Proof of Proposition 


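Let us start by verifying condition (ii) of Theorem 3.2. We have

$$\partial_\theta(\theta\varphi(\theta)) = \frac{e^{-\lambda\theta}\left(e^{\lambda\theta} - \lambda\theta - 1\right)}{\lambda^2\theta^2} \ge 0.$$

Moreover, we can rewrite $\varphi$ as
$\varphi(\theta) = \frac{\lambda\theta - 1 + e^{-\lambda\theta}}{\lambda^2\theta^2}$. Since
$\frac{1}{\rho^2}\left(1+\sqrt{1-\rho^2}\right) \ge 1$ according to Lemma A.1, it is enough to check
that:

$$e^{-\lambda\theta}\left(e^{\lambda\theta} - \lambda\theta - 1\right) \le \lambda\theta - 1 + e^{-\lambda\theta} \iff e^{-\lambda\theta}(2+\lambda\theta) + \lambda\theta - 2 \ge 0 \qquad (A.2)$$

to satisfy condition (ii). Let us set
$\psi(\theta) = e^{-\lambda\theta}(2+\lambda\theta) + \lambda\theta - 2$. We have
$\psi'(\theta) = -\lambda e^{-\lambda\theta}(1+\lambda\theta) + \lambda$ and
$\psi''(\theta) = \lambda^3\theta e^{-\lambda\theta}$. The second derivative of $\psi$ being
non-negative on $\mathbb{R}_+^*$, $\psi'$ is non-decreasing on $\mathbb{R}_+^*$ and is bounded from
below by $\lim_{\theta\to 0}\psi'(\theta) = 0$. Therefore, $\psi$ is also non-decreasing on
$\mathbb{R}_+^*$ and is bounded from below by $\lim_{\theta\to 0}\psi(\theta) = 0$. We deduce that
condition (A.2) is satisfied. Hence,
$0 \le \partial_\theta(\theta\varphi(\theta)) \le \varphi(\theta)$ and condition (ii) is satisfied.

Let us now consider conditions (iii) and (iv). The function $\varphi$ is non-increasing on
$\mathbb{R}_+^*$ since

$$\varphi'(\theta) = \frac{2e^{-\lambda\theta/2}\cosh\left(\frac{\lambda\theta}{2}\right)}{\lambda^2\theta^3}\left(-\lambda\theta + 2\tanh\frac{\lambda\theta}{2}\right) \le 0 \qquad (A.3)$$

because $\tanh x \le x$ on $\mathbb{R}_+^*$. Thus, $\varphi$ is bounded from above by
$\varphi(0) = 1/2$. Consequently, $\theta\varphi(\theta)^2 \le \theta\varphi(\theta)$ and we only
need to verify condition (iii). Since $\partial_\theta(\theta\varphi(\theta)) \ge 0$, the function
$\theta \mapsto \theta\varphi(\theta)$ is non-decreasing and it is bounded from above by the limit
$\lim_{\theta\to+\infty}\theta\varphi(\theta) = \frac{1}{\lambda}$. We conclude that condition
(iii) is satisfied provided that $\frac{1+|\rho|}{\lambda} \le 4$.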
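The bounds used in this proof can be checked numerically. The snippet below is a sanity check on a grid, not a proof; it assumes that $\varphi(\theta) = (\lambda\theta - 1 + e^{-\lambda\theta})/(\lambda^2\theta^2)$, and the value of `lam` is arbitrary.

```python
import numpy as np

lam = 2.0  # arbitrary illustrative value of lambda
theta = np.linspace(1e-3, 200.0, 400_000)
x = lam * theta

# phi(theta) and the derivative of theta * phi(theta), written with expm1
# to limit cancellation errors for small x.
phi = (x + np.expm1(-x)) / x**2
d_theta_phi = (-np.expm1(-x) - x * np.exp(-x)) / x**2

condition_ii = bool(np.all((d_theta_phi >= 0.0) & (d_theta_phi <= phi)))
phi_bounded = bool(np.all(phi <= 0.5))
sup_theta_phi = float(np.max(theta * phi))  # approaches 1 / lam from below
```

On this grid, `condition_ii` and `phi_bounded` hold and `sup_theta_phi` stays below `1 / lam`, in line with the inequalities derived above.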
Appendix A.2. Proof of Proposition 3.2


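We have $\partial_\theta(\theta\varphi(\theta)) = (1-\gamma)\varphi(\theta)$, thus
$0 \le \partial_\theta(\theta\varphi(\theta)) \le \varphi(\theta)$ and condition (ii) of Theorem 3.2
is satisfied since $\frac{1}{\rho^2}\left(1+\sqrt{1-\rho^2}\right) \ge 1$ by Lemma A.1. Let us now
consider conditions (iii) and (iv). We define
$\psi_1(\theta) = \theta\varphi(\theta)(1+|\rho|) - 4$ and
$\psi_2(\theta) = \theta\varphi(\theta)^2(1+|\rho|) - 4$. The function $\psi_1$ is clearly
increasing with $\psi_1(0) = -4$ and $\lim_{\theta\to+\infty}\psi_1(\theta) = +\infty$, so there
exists $\theta_1^* > 0$ such that $\psi_1(\theta_1^*) = 0$. The monotonicity of the function
$\psi_2$ depends on the value of $\gamma$ as
$\psi_2'(\theta) = (1+|\rho|)(1-2\gamma)\varphi(\theta)^2$:

• If $\gamma \in (0,1/2)$, then $\psi_2$ is strictly increasing with $\psi_2(0) = -4$ and
  $\lim_{\theta\to+\infty}\psi_2(\theta) = +\infty$, so there exists $\theta_2^* > 0$ such that
  $\psi_2(\theta_2^*) = 0$.

• If $\gamma \in (1/2,1)$, then $\psi_2$ is strictly decreasing with
  $\lim_{\theta\to 0}\psi_2(\theta) = +\infty$ and $\lim_{\theta\to+\infty}\psi_2(\theta) = -4$, so
  there exists $\theta_2^* > 0$ such that $\psi_2(\theta_2^*) = 0$.

• If $\gamma = 1/2$, then $\psi_2$ is constant and equal to $\eta^2(1+|\rho|) - 4$.

Proposition 3.2 follows by combining the conditions such that $\psi_1(\theta) \le 0$ and
$\psi_2(\theta) \le 0$ and by using Remark 3.5.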


Appendix A.3. Proof of Proposition 

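We have $\partial_\theta(\theta\varphi(\theta)) = \frac{1-\gamma}{1+\theta}\varphi(\theta)$, thus
$0 \le \partial_\theta(\theta\varphi(\theta)) \le \varphi(\theta)$ and condition (ii) of Theorem 3.2
is satisfied since $\frac{1}{\rho^2}\left(1+\sqrt{1-\rho^2}\right) \ge 1$ by Lemma A.1. This also
shows that $\theta \mapsto \theta\varphi(\theta)$ is strictly increasing. Since
$\lim_{\theta\to+\infty}\theta\varphi(\theta) = \eta$, we deduce that condition (iii) is equivalent
to $\eta(1+|\rho|) \le 4$. Finally, for condition (iv), we have:

$$\partial_\theta\left(\theta\varphi(\theta)^2\right) = \frac{\eta^2}{\theta^{2\gamma}(1+\theta)^{3-2\gamma}}\left(1 - \theta - 2\gamma\right). \qquad (A.4)$$

Therefore, we have the following cases:

• If $\gamma \in (1/2,1)$, then $\theta \mapsto \theta\varphi(\theta)^2$ is strictly decreasing on
  $\mathbb{R}_+^*$ with $\lim_{\theta\to 0}\theta\varphi(\theta)^2 = +\infty$ and
  $\lim_{\theta\to+\infty}\theta\varphi(\theta)^2 = 0$. Thus, according to Remark 3.5, if
  $\eta(1+|\rho|) \le 4$ then the SSVI is free of static arbitrage if $\theta_T \ge \theta^*$ for
  all $T > 0$, where $\theta^*$ satisfies $\theta^*\varphi(\theta^*)^2 = 4/(1+|\rho|)$.

• If $\gamma \in (0,1/2)$, then $\theta \mapsto \theta\varphi(\theta)^2$ is strictly increasing on
  $(0, 1-2\gamma)$ and then strictly decreasing on $(1-2\gamma, +\infty)$, thus it is bounded from
  above by $(1-2\gamma)\varphi(1-2\gamma)^2$. We deduce that the SSVI is free of static arbitrage
  for $\eta(1+|\rho|) \le 4$ and $(1-2\gamma)\varphi(1-2\gamma)^2(1+|\rho|) \le 4$.

• If $\gamma = 1/2$, then $\theta \mapsto \theta\varphi(\theta)^2$ is strictly decreasing on
  $\mathbb{R}_+$ and bounded from above by $\eta^2$. We deduce that the SSVI is free of static
  arbitrage for $\eta(1+|\rho|) \le 4$ and $\eta^2(1+|\rho|) \le 4$, which is equivalent to
  $\eta^2(1+|\rho|) \le 4$: for $\eta \le 1$, we have $\eta(1+|\rho|) \le 4$ for all
  $\rho \in [-1,1]$, while for $\eta > 1$, we have $\eta < \eta^2$.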
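The case $\gamma \in (0,1/2)$ can be illustrated numerically. The sketch below assumes that $\varphi(\theta) = \eta / (\theta^{\gamma}(1+\theta)^{1-\gamma})$, a form that is consistent with (A.4) but is an assumption of this example; the values of `eta`, `gamma` and `rho` are hypothetical.

```python
import numpy as np

def theta_phi_squared(theta, eta, gamma):
    """theta * phi(theta)^2 for the assumed form phi = eta / (theta^gamma * (1 + theta)^(1 - gamma))."""
    phi = eta / (theta**gamma * (1.0 + theta) ** (1.0 - gamma))
    return theta * phi**2

eta, gamma, rho = 1.5, 0.3, -0.7  # hypothetical parameter values
theta = np.linspace(1e-4, 10.0, 100_000)
values = theta_phi_squared(theta, eta, gamma)

theta_star = theta[np.argmax(values)]  # grid maximizer, expected near 1 - 2 * gamma
upper_bound = theta_phi_squared(1.0 - 2.0 * gamma, eta, gamma)

# Sufficient conditions of the case gamma in (0, 1/2).
is_arbitrage_free = (eta * (1.0 + abs(rho)) <= 4.0) and (upper_bound * (1.0 + abs(rho)) <= 4.0)
```

The grid maximizer sits next to $1-2\gamma$, so checking the bound at that single point is enough to validate condition (iv) for all $\theta > 0$.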